
LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models (2404.01230v1)

Published 1 Apr 2024 in cs.CL

Abstract: This paper presents a comprehensive survey of the current status and opportunities for LLMs in strategic reasoning, a sophisticated form of reasoning that necessitates understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly. Strategic reasoning is distinguished by its focus on the dynamic and uncertain nature of interactions among multiple agents, where comprehending the environment and anticipating the behavior of others is crucial. We explore the scopes, applications, methodologies, and evaluation metrics related to strategic reasoning with LLMs, highlighting the burgeoning development in this area and the interdisciplinary approaches enhancing their decision-making performance. The survey aims to systematize and clarify the scattered literature on this subject, providing a systematic review that underscores the importance of strategic reasoning as a critical cognitive capability and offers insights into future research directions and potential improvements.


Summary

  • The paper presents a comprehensive survey of LLM-driven strategic reasoning in multi-agent settings, emphasizing applications like game theory and economic simulations.
  • It details innovative methodologies such as prompt engineering, modular enhancements, and theory of mind to enhance model decision-making.
  • The evaluation framework combines quantitative metrics and qualitative analysis, paving the way for refined AI strategies in complex environments.

LLM as a Mastermind: A Survey of Strategic Reasoning with LLMs

Introduction to Strategic Reasoning with LLMs

Strategic reasoning embodies the interdisciplinary effort to endow artificial agents with the ability to deliberate and decide actions considering not only their goals but also the potential strategies of others in a multi-agent setting. LLMs, recently the focal point of AI research, have presented novel opportunities and challenges in this domain. They are leveraged for their unprecedented capacity to handle and interpret vast amounts of data, mimicking human-like reasoning processes in intricate, dynamic environments. This survey systematically explores the application of LLMs in strategic reasoning, consolidating dispersed efforts and laying the groundwork for future advancements.

Strategic reasoning distinguishes itself by requiring agents to anticipate and react to the potentially unpredictable behaviors of other entities within their environment, making it a particularly challenging aspect of AI development (Figure 1).

Figure 1: Strategic reasoning with LLMs.

Defining Strategic Reasoning in the Context of LLMs

Strategic reasoning is characterized by its dynamic, interactive nature involving multiple agents. It extends beyond simple decision-making to encompass the anticipation of other agents' actions and the influence one's actions can exert on the environment. Core to this is the capacity for predictive analytics, abstract thinking, and cognitive empathy—all areas where LLMs can excel by simulating complex evaluative processes via language.

Strategic reasoning environments, termed 'GAMES', place agents in dynamic, goal-oriented interactions (Figure 2). These settings require LLMs not only to generate responses based on learned data but also to adapt and refine strategies in real time, accounting for ongoing changes and feedback from interconnected agents.

Figure 2: Environment in strategic reasoning of multi-agent systems (MAS).
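The 'GAMES' framing above can be made concrete with a minimal sketch of a multi-agent interaction loop. The class and agent names here are illustrative assumptions, not an interface from the survey: each round, every agent observes the shared history and returns an action, so an adaptive agent can condition its next move on what others just did.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class GameEnv:
    """Hypothetical 'GAMES'-style loop: agents map shared history -> action."""
    agents: List[Callable[[list], str]]
    history: list = field(default_factory=list)

    def step(self) -> list:
        # Collect one action per agent, then publish the joint outcome
        # so every agent can react to it on the next round.
        actions = [agent(self.history) for agent in self.agents]
        self.history.append(actions)
        return actions

# Two toy agents: one fixed policy, one that mirrors its opponent's last move.
always_a = lambda hist: "A"
copycat = lambda hist: hist[-1][0] if hist else "B"

env = GameEnv(agents=[always_a, copycat])
env.step()  # round 1: ["A", "B"]
env.step()  # round 2: ["A", "A"] — copycat adapts to observed behavior
```

The copycat agent illustrates the defining feature of these environments: its policy depends on the evolving history, not on a fixed input.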

Scenarios of Application

LLMs engage in strategic reasoning across multiple scenarios:

  • Societal Simulation: Using LLMs to model and predict societal behaviors involving complex interpersonal interactions, improving understanding of social norms and the effects of collective behavior.
  • Economic Simulation: Applying LLMs to analyze market dynamics and decision-making processes, with applications ranging from stock prediction to resource allocation within economic frameworks.
  • Game Theory: Strategic reasoning is intrinsic to game theory, where LLMs simulate player strategies and analyze competitive and cooperative behavior among rational agents.
  • Gaming: From board games to esports, LLMs support strategy development and execution, adapting to diverse and unpredictable game scenarios.

These applications underline the versatility and adaptability of LLMs in predicting and countering dynamic environmental interactions.
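For the game-theory scenario, the canonical testbed is a matrix game such as the one-shot prisoner's dilemma, in which each agent's payoff depends jointly on both players' choices. A minimal sketch (the payoff values are the standard illustrative ones, not taken from the survey):

```python
# One-shot prisoner's dilemma payoff matrix: (payoff_a, payoff_b).
# Mutual cooperation beats mutual defection, but unilateral defection
# pays best — the tension that makes the game a strategic-reasoning test.
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

def play(action_a: str, action_b: str) -> tuple:
    """Return (payoff_a, payoff_b) for one round."""
    return PAYOFFS[(action_a, action_b)]

play("defect", "cooperate")  # defector earns 5, cooperator 0
```

Benchmarks in this family probe whether an LLM recognizes such incentive structures and whether its choices shift between one-shot and repeated versions of the same game.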

Enhancing Strategic Reasoning Capabilities

Improving LLMs' strategic reasoning involves several approaches:

  • Prompt Engineering: Carefully crafting input prompts to elicit effective reasoning patterns and responses, crucial for tasks requiring high contextual understanding.
  • Modular Enhancements: Integrating specialized modules to extend LLM functionality, which aids in memory integration and utilization of domain-specific knowledge for strategic tasks.
  • Theory of Mind (ToM): Implementing ToM frameworks to allow LLMs to better predict other agents' actions based on inferred mental states, significantly enhancing decision-making accuracy.
  • Imitation and Reinforcement Learning: These approaches adapt LLM strategies through feedback mechanisms, allowing models to refine actions based on simulated or real interactive experiences.

    Figure 3: Methods for Improving Strategic Reasoning with LLMs.
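The prompt-engineering approach can be sketched as a template that asks the model to reason about opponents before committing to an action, in the spirit of chain-of-thought prompting. The function, template wording, and answer format below are assumptions for illustration, not a prompt from the survey:

```python
def build_strategy_prompt(game_rules: str, history: list, role: str) -> str:
    """Illustrative prompt template: predict opponents first, then act."""
    turns = "\n".join(f"Round {i + 1}: {h}" for i, h in enumerate(history))
    return (
        f"You are playing as {role}.\n"
        f"Rules: {game_rules}\n"
        f"History so far:\n{turns or '(none)'}\n"
        "First, predict what each opponent is likely to do next and why.\n"
        "Then choose your own action. Answer as: PREDICTION: ... ACTION: ..."
    )

prompt = build_strategy_prompt(
    game_rules="Iterated prisoner's dilemma, 10 rounds.",
    history=[["cooperate", "defect"]],
    role="Player 1",
)
```

Separating the prediction step from the action step is one simple way to elicit opponent modeling; the theory-of-mind methods above push the same idea further by maintaining explicit beliefs about other agents' mental states.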

Evaluation Metrics

Assessing the performance of LLMs in strategic reasoning involves:

  • Quantitative Metrics: Win rates and reward-based evaluations provide concrete measures of strategic efficacy in controlled environments.
  • Qualitative Analysis: Evaluates aspects such as cooperation, deception, and adaptiveness, offering insights into LLM capabilities beyond numerical outcomes.

Together, these metrics ensure a comprehensive evaluation of LLMs' abilities in understanding and navigating complex multi-agent environments.
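The quantitative side of such an evaluation reduces to simple aggregates over logged episodes. A minimal sketch, where the log field names (`winner`, `reward`) are illustrative assumptions rather than a format defined in the survey:

```python
def evaluate(episodes: list) -> dict:
    """Compute win rate and mean reward over a list of episode logs."""
    wins = sum(1 for ep in episodes if ep["winner"] == "llm")
    avg_reward = sum(ep["reward"] for ep in episodes) / len(episodes)
    return {"win_rate": wins / len(episodes), "avg_reward": avg_reward}

logs = [
    {"winner": "llm", "reward": 5},
    {"winner": "opponent", "reward": 1},
    {"winner": "llm", "reward": 3},
]
evaluate(logs)  # win_rate 2/3, avg_reward 3.0
```

Qualitative dimensions such as deception or cooperativeness resist this kind of scalar summary, which is why the surveyed benchmarks pair numeric scores with behavioral analysis.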

Future Challenges and Opportunities

While LLMs have demonstrated exceptional potential in simulating strategic reasoning, challenges remain, particularly in terms of consistently replicating human-like reasoning under diverse conditions. The development of unified benchmarks for evaluation and the fusion of multiple methodologies could propel advancements, enhancing the models' ability to handle complex, dynamic interactions effectively.

Further interdisciplinary research should explore the integration of cognitive theories and AI, leveraging each to develop more refined and intelligent agents capable of deftly navigating the intricate webs of strategic interaction present in real and simulated environments.

Conclusion

The integration of LLMs into strategic reasoning represents a significant stride in AI research, unlocking new capabilities and applications in complex multi-agent environments. Continued refinement and evaluation will be crucial to strengthening these models' roles as intelligent, adaptable agents across diverse strategic scenarios, with broad potential for AI-driven decision-making and strategy development.
