The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey (2404.11584v1)
Abstract: This survey examines recent advancements in AI agent implementations, focusing on their ability to achieve complex goals that require enhanced reasoning, planning, and tool-execution capabilities. The primary objectives of this work are to a) communicate the current capabilities and limitations of existing AI agent implementations, b) share insights gained from our observations of these systems in action, and c) suggest important considerations for future developments in AI agent design. We achieve this by providing overviews of single-agent and multi-agent architectures, identifying key patterns and divergences in design choices, and evaluating their overall impact on accomplishing a provided goal. Our contribution outlines key themes in selecting an agentic architecture, the impact of leadership on agent systems, agent communication styles, and the key phases of planning, execution, and reflection that enable robust AI agent systems.