
Faster Game Solving via Hyperparameter Schedules (2404.09097v1)

Published 13 Apr 2024 in cs.GT

Abstract: The counterfactual regret minimization (CFR) family of algorithms consists of iterative algorithms for imperfect-information games. In two-player zero-sum games, the time average of the iterates converges to a Nash equilibrium. The state-of-the-art prior variants, Discounted CFR (DCFR) and Predictive CFR$+$ (PCFR$+$), are the fastest known algorithms for solving two-player zero-sum games in practice, both in the extensive-form setting and the normal-form setting. They enhance the convergence rate compared to vanilla CFR by applying discounted weights to early iterations in various ways, leveraging fixed weighting schemes. We introduce Hyperparameter Schedules (HSs), which are remarkably simple yet highly effective in expediting the rate of convergence. An HS dynamically adjusts the hyperparameter governing the discounting scheme of CFR variants. HSs on top of DCFR or PCFR$+$ are now the new state of the art in solving zero-sum games and yield orders-of-magnitude speed improvements. The new algorithms are also easy to implement because 1) they are small modifications to the existing ones in terms of code and 2) they require no game-specific tuning.
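The idea in the abstract can be sketched concretely: DCFR discounts accumulated positive and negative regrets and the strategy average with fixed exponents, and a hyperparameter schedule simply makes one of those exponents a function of the iteration count. The sketch below runs DCFR-style regret-matching self-play on rock-paper-scissors, a small two-player zero-sum normal-form game. The specific schedule passed in (`alpha` ramping from 1.0 to 2.0) is a purely illustrative assumption, not the schedule from the paper; the fixed exponents beta = 0 and gamma = 2 are the standard DCFR defaults.

```python
import numpy as np

# Payoff matrix for rock-paper-scissors (row player's payoff; zero-sum).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

def regret_matching(regrets):
    """Map cumulative regrets to a strategy (uniform if none are positive)."""
    pos = np.maximum(regrets, 0.0)
    total = pos.sum()
    return pos / total if total > 0 else np.full(len(regrets), 1.0 / len(regrets))

def dcfr_selfplay(T, alpha_schedule):
    """DCFR-style self-play where alpha is supplied by a schedule.

    alpha_schedule(t) returns the positive-regret discount exponent for
    iteration t. Beta and gamma are held at the standard DCFR values (0, 2).
    """
    n = A.shape[0]
    r1 = np.zeros(n); r2 = np.zeros(n)   # cumulative regrets per player
    s1 = np.zeros(n); s2 = np.zeros(n)   # discounted strategy sums
    for t in range(1, T + 1):
        x = regret_matching(r1)
        y = regret_matching(r2)
        # Expected action payoffs against the opponent's current strategy.
        u1 = A @ y
        u2 = -A.T @ x
        r1 += u1 - x @ u1
        r2 += u2 - y @ u2
        # DCFR discounting of cumulative regrets:
        # positive entries scaled by t^a/(t^a+1), negative by t^b/(t^b+1).
        a = alpha_schedule(t)
        d_pos = t**a / (t**a + 1)
        d_neg = 0.5                       # beta = 0 gives t^0/(t^0+1) = 1/2
        for r in (r1, r2):
            r[r > 0] *= d_pos
            r[r < 0] *= d_neg
        # Strategy averaging weighted by (t/(t+1))^gamma with gamma = 2.
        w = (t / (t + 1)) ** 2
        s1 *= w; s2 *= w
        s1 += x; s2 += y
    return s1 / s1.sum(), s2 / s2.sum()

# Hypothetical schedule: ramp alpha from 1.0 toward 2.0 over 1000 iterations.
xbar, ybar = dcfr_selfplay(5000, lambda t: 1.0 + min(1.0, t / 1000))
```

Because rock-paper-scissors has the unique Nash equilibrium (1/3, 1/3, 1/3), the averaged strategies `xbar` and `ybar` should approach uniform. The design point the abstract emphasizes is visible here: switching from fixed DCFR to a scheduled variant changes only the `alpha_schedule` argument, which is why the authors describe HSs as small code modifications requiring no game-specific tuning.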
