Geometry, Computation, and Optimality in Stochastic Optimization (1909.10455v3)
Abstract: We study computational and statistical consequences of problem geometry in stochastic and online optimization. By focusing on constraint set and gradient geometry, we characterize the problem families for which stochastic- and adaptive-gradient methods are (minimax) optimal and, conversely, when nonlinear updates -- such as those mirror descent employs -- are necessary for optimal convergence. When the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient methods are minimax optimal. We provide quantitative converses showing that the "distance" of the underlying constraints from quadratic convexity determines the sub-optimality of subgradient methods. These results apply, for example, to any $\ell_p$-ball for $p < 2$, and the computation/accuracy tradeoffs they demonstrate exhibit a striking analogy to those in Gaussian sequence models.
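To make the contrast between linear (subgradient-style) and nonlinear (mirror-descent-style) updates concrete, here is a minimal sketch, not taken from the paper, comparing Euclidean projected SGD with entropic mirror descent (exponentiated gradient) on a stochastic linear objective over the probability simplex, a canonical example of a constraint set that is not quadratically convex. All function names, step sizes, and the toy objective are illustrative assumptions, not the paper's experiments.

```python
# Illustrative sketch (not from the paper): projected SGD vs. entropic mirror
# descent on a stochastic linear objective over the probability simplex.
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def sgd_projected(grad_oracle, d, steps, lr):
    """Linear update: Euclidean (sub)gradient step followed by projection."""
    x = np.full(d, 1.0 / d)
    avg = np.zeros(d)
    for t in range(1, steps + 1):
        g = grad_oracle(x)
        x = project_simplex(x - lr / np.sqrt(t) * g)
        avg += (x - avg) / t            # running average of iterates
    return avg

def mirror_descent_entropic(grad_oracle, d, steps, lr):
    """Nonlinear update: exponentiated gradient, matched to l1/simplex geometry."""
    x = np.full(d, 1.0 / d)
    avg = np.zeros(d)
    for t in range(1, steps + 1):
        g = grad_oracle(x)
        x = x * np.exp(-lr / np.sqrt(t) * g)
        x /= x.sum()                    # Bregman projection is renormalization
        avg += (x - avg) / t
    return avg

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 200
    c = rng.normal(size=d)              # expected gradient of the linear loss
    noisy_grad = lambda x: c + rng.normal(scale=1.0, size=d)
    for name, method in [("projected SGD", sgd_projected),
                         ("entropic mirror descent", mirror_descent_entropic)]:
        x_bar = method(noisy_grad, d, steps=2000, lr=0.5)
        print(f"{name:>26s}: objective gap = {c @ x_bar - c.min():.4f}")
```

On this toy problem the mirror-descent update exploits the simplex geometry, whereas the Euclidean update does not; the paper's results quantify when such nonlinear updates are genuinely necessary for minimax-optimal rates and when diagonally pre-conditioned stochastic gradient methods already suffice.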