
Covariance Adaptive Best Arm Identification (2306.02630v2)

Published 5 Jun 2023 in stat.ML and cs.LG

Abstract: We consider the problem of best arm identification in the multi-armed bandit model, under fixed confidence. Given a confidence input $\delta$, the goal is to identify the arm with the highest mean reward with probability at least $1-\delta$, while minimizing the number of arm pulls. While the literature provides solutions to this problem under the assumption of independent arm distributions, we propose a more flexible scenario in which arms can be dependent and rewards can be sampled simultaneously. This framework allows the learner to estimate the covariance among the arm distributions, enabling more efficient identification of the best arm. The relaxed setting we propose is relevant in various applications, such as clinical trials, where similarities between patients or drugs suggest underlying correlations in the outcomes. We introduce new algorithms that adapt to the unknown covariance of the arms and demonstrate through theoretical guarantees that substantial improvement can be achieved over the standard setting. Additionally, we provide new lower bounds for the relaxed setting and present numerical simulations that support the theoretical findings.
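The core idea in the abstract — sampling all arms simultaneously so that correlation between them can be exploited — can be illustrated with a small sketch. This is not the paper's algorithm; it is a simple elimination scheme, with a heuristic (not anytime-valid) confidence radius, whose names (`correlated_bai`, `sample_fn`) are invented for illustration. The key point it demonstrates is that pairing joint draws makes elimination decisions depend on $\mathrm{Var}(X_i - X_j) = \mathrm{Var}(X_i) + \mathrm{Var}(X_j) - 2\,\mathrm{Cov}(X_i, X_j)$, which shrinks when arms are positively correlated.

```python
import numpy as np

def correlated_bai(sample_fn, n_arms, delta=0.05, max_rounds=100_000):
    """Illustrative fixed-confidence best-arm identification exploiting
    correlation between arms (a sketch, not the paper's algorithm).

    sample_fn() must return one joint draw: a vector of rewards, one per
    arm, sampled simultaneously (the relaxed setting in the abstract).
    Arm j is eliminated when the paired-difference confidence interval
    for (leader - j) lies strictly above zero; with positively
    correlated arms the paired differences have reduced variance.
    """
    active = set(range(n_arms))
    draws = []
    for t in range(1, max_rounds + 1):
        draws.append(sample_fn())
        if t < 2:
            continue  # need at least two draws to estimate variances
        X = np.asarray(draws)                      # shape (t, n_arms)
        means = X.mean(axis=0)
        leader = max(active, key=lambda a: means[a])
        for j in list(active):
            if j == leader:
                continue
            diff = X[:, leader] - X[:, j]          # paired differences
            var = diff.var(ddof=1)
            # Heuristic Gaussian-style radius with a crude union bound;
            # the paper's algorithms use tighter, anytime-valid bounds.
            radius = np.sqrt(2 * var * np.log(2 * n_arms * t * t / delta) / t)
            if diff.mean() - radius > 0:
                active.discard(j)                  # j is provably worse
        if len(active) == 1:
            break
    # Return the empirically best arm among those still active.
    return max(active, key=lambda a: np.asarray(draws).mean(axis=0)[a])
```

As a usage example, arms driven by a large shared noise term (so the arm-specific noise in each paired difference is small) are identified with far fewer joint draws than independent sampling would need:

```python
rng = np.random.default_rng(0)
mu = np.array([0.0, 0.2, 0.5])

def sample_fn():
    common = rng.normal()  # shared noise -> strong positive correlation
    return mu + 0.9 * common + 0.1 * rng.normal(size=3)

best = correlated_bai(sample_fn, n_arms=3, delta=0.05)
```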

