Stochastic Optimization with Constraints: A Non-asymptotic Instance-Dependent Analysis (2404.00042v1)

Published 24 Mar 2024 in math.OC, cs.AI, cs.LG, and stat.ML

Abstract: We consider the problem of stochastic convex optimization under convex constraints. We analyze the behavior of a natural variance-reduced proximal gradient (VRPG) algorithm for this problem. Our main result is a non-asymptotic guarantee for the VRPG algorithm. In contrast to minimax worst-case guarantees, our result is instance-dependent: the guarantee captures the complexity of the loss function, the variability of the noise, and the geometry of the constraint set. We show that the non-asymptotic performance of the VRPG algorithm is governed by the distance, scaled by $\sqrt{N}$, between the solution of the given problem and that of a certain small perturbation of it -- both solved under the given convex constraints; here, $N$ denotes the number of samples. Leveraging a well-established connection between local minimax lower bounds and solutions to perturbed problems, we show that as $N \rightarrow \infty$, the VRPG algorithm achieves the renowned local minimax lower bound of Hájek and Le Cam up to universal constants and a logarithmic factor in the sample size.
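
The paper's object of study is a variance-reduced proximal gradient (VRPG) method for constrained stochastic convex optimization. As a rough illustration of the general recipe, in the spirit of SVRG-type proximal/projected gradient methods (cf. references 9 and 23 below), here is a minimal Python sketch. The name `vrpg_sketch`, the step size, the epoch schedule, and the least-squares toy instance are illustrative assumptions, not the paper's exact algorithm or tuning.

```python
import numpy as np

def vrpg_sketch(grad_i, n, x0, project, eta=0.05, epochs=20, inner_steps=None):
    """Illustrative SVRG-style variance-reduced projected-gradient loop.

    A sketch of the generic recipe, not the paper's exact VRPG procedure.
      grad_i(x, i) -- gradient of the i-th sample loss at x
      n            -- number of samples N
      project(x)   -- Euclidean projection onto the convex constraint set
    """
    if inner_steps is None:
        inner_steps = n
    rng = np.random.default_rng(0)
    x_snap = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        # Full gradient at the snapshot point; anchors the control variate.
        mu = sum(grad_i(x_snap, i) for i in range(n)) / n
        x = x_snap.copy()
        for _ in range(inner_steps):
            i = rng.integers(n)
            # Unbiased estimate whose variance shrinks as x approaches x_snap.
            g = grad_i(x, i) - grad_i(x_snap, i) + mu
            # Projected (proximal) step keeps the iterate feasible.
            x = project(x - eta * g)
        x_snap = x
    return x_snap

# Toy instance (hypothetical): least squares over the Euclidean unit ball.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
b = A @ rng.standard_normal(5) + 0.1 * rng.standard_normal(200)
grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]
project = lambda x: x / max(1.0, float(np.linalg.norm(x)))
x_hat = vrpg_sketch(grad_i, n=200, x0=np.zeros(5), project=project)
```

The projection here plays the role of the proximal operator of the indicator function of the constraint set; for a more general regularizer one would replace it with the corresponding proximal map.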

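For context, the local asymptotic minimax bound attributed above to Hájek and Le Cam has, in its classical smooth parametric (LAN) form, the schematic statement below. This is the textbook version (see references 7, 14, and 19), not a quotation from this paper, whose constrained-optimization analogue replaces the Gaussian limit with the solution of a perturbed constrained problem. For any estimator sequence $\hat{\theta}_N$ and bowl-shaped loss $\ell$:

```latex
\lim_{c \to \infty} \, \liminf_{N \to \infty} \,
  \sup_{\|h\| \le c}
  \mathbb{E}_{\theta_0 + h/\sqrt{N}}
  \Big[ \ell\big( \sqrt{N}\,\big(\hat{\theta}_N - \theta_0 - h/\sqrt{N}\big) \big) \Big]
  \;\ge\; \mathbb{E}\big[ \ell(Z) \big],
  \qquad Z \sim \mathcal{N}\big(0,\, I(\theta_0)^{-1}\big),
```

with $I(\theta_0)$ the Fisher information at $\theta_0$.
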
References (23)
  1. A framework for estimation of convex functions. Statistica Sinica, pages 423–456, 2015.
  2. Asymptotic normality and optimality in nonsmooth stochastic approximation. arXiv preprint arXiv:2301.06632, 2023.
  3. A. L. Dontchev and R. T. Rockafellar. Implicit functions and solution mappings: A view from variational analysis, volume 616. Springer, 2009.
  4. Local minimax complexity of stochastic convex optimization. Advances in Neural Information Processing Systems, 29, 2016.
  5. J. C. Duchi and F. Ruan. Asymptotic optimality in stochastic optimization. 2021.
  6. J. Dupačová and R. Wets. Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems. The Annals of Statistics, 16(4):1517–1549, 1988.
  7. J. Hájek. Local asymptotic minimax and admissibility in estimation. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 175–194, 1972.
  8. J.-B. Hiriart-Urruty and C. Lemaréchal. Convex analysis and minimization algorithms I: Fundamentals, volume 305. Springer Science & Business Media, 1996.
  9. R. Johnson and T. Zhang. Accelerating stochastic gradient descent using predictive variance reduction. Advances in Neural Information Processing Systems, 26, 2013.
  10. Is temporal difference learning optimal? an instance-dependent analysis. SIAM Journal on Mathematics of Data Science, 3(4):1013–1040, 2021.
  11. Instance-optimality in optimal value estimation: Adaptivity via variance-reduced q-learning. arXiv preprint arXiv:2106.14352, 2021.
  12. A. J. King. Asymptotic behaviour of solutions in stochastic optimization: nonsmooth analysis and the derivation of non-normal limit distributions (least squares). 1986.
  13. A. J. King and R. T. Rockafellar. Asymptotic theory for solutions in statistical estimation and stochastic programming. Mathematics of Operations Research, 18(1):148–162, 1993.
  14. L. Le Cam et al. Limits of experiments. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 245–261. University of California Press, 1972.
  15. L. Le Cam and G. L. Yang. Asymptotics in statistics: Some basic concepts. Springer, 2000.
  16. Y. Nesterov. Primal-dual subgradient methods for convex problems. Mathematical Programming, 120(1):221–259, 2009.
  17. R. Poliquin and R. T. Rockafellar. Tilt stability of a local minimum. SIAM Journal on Optimization, 8(2):287–299, 1998.
  18. A. Shapiro. Asymptotic properties of statistical estimators in stochastic programming. The Annals of Statistics, 17(2):841–858, 1989.
  19. A. W. Van der Vaart. Asymptotic statistics, volume 3. Cambridge University Press, 2000.
  20. M. J. Wainwright. Stochastic approximation with cone-contractive operators: Sharp $\ell_\infty$-bounds for $Q$-learning. arXiv preprint arXiv:1905.06265, 2019.
  21. M. J. Wainwright. Variance-reduced $Q$-learning is minimax optimal. arXiv preprint arXiv:1906.04697, 2019.
  22. J. Wellner et al. Weak convergence and empirical processes: with applications to statistics. Springer Science & Business Media, 2013.
  23. L. Xiao and T. Zhang. A proximal stochastic gradient method with progressive variance reduction. SIAM Journal on Optimization, 24(4):2057–2075, 2014.