An important concept in online learning and convex optimization is that of
strong convexity: a twice-differentiable function $f$ is said to be strongly
convex with respect to a norm $\|\cdot\|$ if
$z^T \nabla^2 f(x) z \geq \|z\|^2$ for all $x$ and $z$.
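As a quick sanity check on this definition, a standard example is the squared Euclidean norm: for $f(x) = \frac{1}{2}\|x\|_2^2$ the Hessian is the identity, so
$z^T \nabla^2 f(x) z = \|z\|_2^2$ for all $x$ and $z$,
and $f$ is strongly convex with respect to $\|\cdot\|_2$, meeting the bound with equality.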
Here's a fun counterexample: a function $\mathbb{R}^n \to \mathbb{R}$ that is
jointly convex in any $n-1$ of the variables, but not in all variables at once.
The function
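One candidate with this property (a sketch of my own, valid for $n \geq 3$; the post's example may differ) is the quadratic
$f(x) = \frac{1}{2} x^T A x, \qquad A_{ii} = 1, \qquad A_{ij} = -\frac{1}{n-2} \text{ for } i \neq j.$
Every $(n-1) \times (n-1)$ principal submatrix of $A$ has eigenvalues $1 + \frac{1}{n-2} > 0$ and $0$, so fixing any one variable leaves a convex function of the remaining $n-1$; but $A$ itself has eigenvalue $-\frac{1}{n-2} < 0$ along the all-ones vector, so $f$ is not jointly convex.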
I spent the last several hours trying to come up with an efficient algorithm for
the following problem:
Problem: Suppose that we have a sequence of $l$ pairs of non-negative numbers
$(a_1, b_1), \ldots, (a_l, b_l)$
While grading homeworks today, I came across the following bound:
Theorem 1: If $A$ and $B$ are symmetric $n \times n$ matrices with eigenvalues
$\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n$
The KL divergence is an important tool for quantifying how one probability
distribution differs from another. Formally, given two distributions $p$ and $q$, the KL
divergence is defined as
$KL(p || q) := \int p(x) \log \frac{p(x)}{q(x)} \, dx.$
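As a quick worked example (standard, using $\alpha$ and $\beta$ for the parameters since $p$ and $q$ denote the distributions above): if $p$ and $q$ are Bernoulli distributions with success probabilities $\alpha$ and $\beta$, the integral reduces to a sum over the two outcomes,
$KL(p || q) = \alpha \log \frac{\alpha}{\beta} + (1 - \alpha) \log \frac{1 - \alpha}{1 - \beta},$
which is zero exactly when $\alpha = \beta$ and, notably, is not symmetric in the two distributions.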