# math

An important concept in online learning and convex optimization is that of strong convexity: a twice-differentiable function $f$ is said to be strongly convex with respect to a norm $\|\cdot\|$ if $z^T\ Here's a fun counterexample: a function$\mathbb{R}^n \to \mathbb{R}$that is jointly convex in any$n-1$of the variables, but not in all variables at once. The function is$f(
I spent the last several hours trying to come up with an efficient algorithm to the following problem: Problem: Suppose that we have a sequence of $l$ pairs of non-negative numbers $(a_1, While grading homeworks today, I came across the following bound: Theorem 1: If A and B are symmetric$n\times n$matrices with eigenvalues$\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_
The KL divergence is an important tool for studying the distance between two probability distributions. Formally, given two distributions $p$ and $q$, the KL divergence is defined as \$KL(p || q) := \int p(