Adaptive Online Learning with Varying Norms
arXiv:2002.03963

Abstract
Given any increasing sequence of norms $\|\cdot\|_0,\dots,\|\cdot\|_{T-1}$, we provide an online convex optimization algorithm that outputs points $w_t$ in some domain $W$ in response to convex losses $\ell_t:W\to \mathbb{R}$ that guarantees regret $R_T(u)=\sum_{t=1}^T \ell_t(w_t)-\ell_t(u)\le \tilde O\left(\|u\|_{T-1}\sqrt{\sum_{t=1}^T \|g_t\|_{t-1,\star}^2}\right)$ where $g_t$ is a subgradient of $\ell_t$ at $w_t$. Our method does not require tuning to the value of $u$ and allows for arbitrary convex $W$. We apply this result to obtain new "full-matrix"-style regret bounds. Along the way, we provide a new examination of the full-matrix AdaGrad algorithm, suggesting a better learning rate value that improves significantly upon prior analysis. We use our new techniques to tune AdaGrad on-the-fly, realizing our improved bound in a concrete algorithm.
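For context on the algorithm the abstract revisits, the following is a minimal sketch of the standard full-matrix AdaGrad update (not the paper's tuned variant): accumulate $G_t = \sum_{s\le t} g_s g_s^\top$ and step along $-\eta\, G_t^{-1/2} g_t$. The function name, the fixed learning rate `eta`, and the `eps` regularizer are illustrative assumptions; the paper's contribution is precisely to choose the learning rate on-the-fly rather than hand-set it.

```python
import numpy as np

def full_matrix_adagrad(grad_fn, w0, eta=0.5, steps=200, eps=1e-8):
    """Sketch of unconstrained full-matrix AdaGrad.

    grad_fn(w) returns a subgradient g_t at the current iterate w_t.
    We maintain G_t = sum_s g_s g_s^T and update
        w_{t+1} = w_t - eta * G_t^{-1/2} g_t.
    NOTE: `eta` is a hand-chosen constant here, not the improved
    learning rate suggested in the paper; this only illustrates
    the base update rule being analyzed.
    """
    w = np.asarray(w0, dtype=float)
    d = w.shape[0]
    G = np.zeros((d, d))
    for _ in range(steps):
        g = grad_fn(w)
        G += np.outer(g, g)
        # Inverse matrix square root via eigendecomposition,
        # with eps added to eigenvalues for numerical stability.
        vals, vecs = np.linalg.eigh(G)
        inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
        w = w - eta * inv_sqrt @ g
    return w

# Usage: minimize the quadratic f(w) = 0.5 * ||w - c||^2,
# whose gradient at w is simply w - c.
c = np.array([1.0, -2.0])
w_final = full_matrix_adagrad(lambda w: w - c, w0=np.zeros(2))
```

On this toy quadratic the iterates approach the minimizer `c`; the per-coordinate preconditioner $G_t^{-1/2}$ is what gives AdaGrad its "full-matrix" regret bounds, at $O(d^2)$ memory and an eigendecomposition per step.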