LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models (2310.03903v3)
Abstract: LLMs have demonstrated emergent common-sense reasoning and Theory of Mind (ToM) capabilities, making them promising candidates for developing coordination agents. This study introduces the LLM-Coordination Benchmark, a novel benchmark for analyzing LLMs in the context of Pure Coordination Settings, where agents must cooperate to maximize gains. Our benchmark evaluates LLMs through two distinct tasks. The first is Agentic Coordination, where LLMs act as proactive participants in four pure coordination games. The second is Coordination Question Answering (CoordQA), which tests LLMs on 198 multiple-choice questions across these games to evaluate three key abilities: Environment Comprehension, ToM Reasoning, and Joint Planning. Results from the Agentic Coordination experiments reveal that LLM agents excel in multi-agent coordination settings where decision-making relies primarily on environmental variables, but face challenges in scenarios requiring active consideration of partners' beliefs and intentions. The CoordQA experiments further highlight significant room for improvement in LLMs' Theory of Mind reasoning and joint planning capabilities. Zero-Shot Coordination (ZSC) experiments in the Agentic Coordination setting demonstrate that LLM agents, unlike RL methods, exhibit robustness to unseen partners. These findings indicate the potential of LLMs as agents in pure coordination setups and underscore areas for improvement. Code available at https://github.com/eric-ai-lab/LLM_coordination.
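To make the CoordQA setup concrete, below is a minimal sketch of a per-category scoring loop over multiple-choice items, assuming a simple item schema with a category tag (Environment Comprehension, ToM Reasoning, or Joint Planning), candidate options, and a correct index. All names here (`CoordQAItem`, `pick_option`, the category strings) are illustrative assumptions, not the benchmark's actual API; the real data format lives in the linked repository.

```python
# Hypothetical sketch of a CoordQA-style evaluation harness.
# Schema and function names are assumptions for illustration only.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class CoordQAItem:
    category: str        # e.g. "environment", "tom", or "joint_planning"
    question: str        # game-state description plus the query
    options: list[str]   # multiple-choice candidate answers/actions
    answer_index: int    # index of the correct option

def pick_option(item: CoordQAItem) -> int:
    """Stand-in for an LLM call that returns the chosen option index.

    A real harness would prompt the model with the question and options,
    then parse the selected letter back into an index.
    """
    return 0

def evaluate(items: list[CoordQAItem]) -> dict[str, float]:
    """Compute per-category accuracy over the multiple-choice items."""
    correct, total = defaultdict(int), defaultdict(int)
    for item in items:
        total[item.category] += 1
        if pick_option(item) == item.answer_index:
            correct[item.category] += 1
    return {cat: correct[cat] / total[cat] for cat in total}

if __name__ == "__main__":
    demo = [CoordQAItem("tom", "Partner hinted 'red'...",
                        ["Play card 1", "Discard card 1"], 0)]
    print(evaluate(demo))  # {'tom': 1.0} with the stub chooser above
```

Reporting accuracy per category rather than a single aggregate is what lets the benchmark separate Environment Comprehension from ToM Reasoning and Joint Planning, as described in the abstract.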