Emergent Mind

A Survey on Large Language Model-Based Game Agents

(2404.02039)
Published Apr 2, 2024 in cs.AI

Abstract

The development of game agents holds a critical role in advancing towards AGI. The progress of LLMs and their multimodal counterparts (MLLMs) offers an unprecedented opportunity to evolve and empower game agents with human-like decision-making capabilities in complex computer game environments. This paper provides a comprehensive overview of LLM-based game agents from a holistic viewpoint. First, we introduce the conceptual architecture of LLM-based game agents, centered around six essential functional components: perception, memory, thinking, role-playing, action, and learning. Second, we survey existing representative LLM-based game agents documented in the literature with respect to methodologies and adaptation agility across six genres of games, including adventure, communication, competition, cooperation, simulation, and crafting & exploration games. Finally, we present an outlook of future research and development directions in this burgeoning field. A curated list of relevant papers is maintained and made accessible at: https://github.com/git-disl/awesome-LLM-game-agent-papers.

Architecture of LLMGAs for strategy formulation in games using multimodal information and role-playing for decision making.

Overview

  • The paper surveys Large Language Model-based Game Agents (LLMGAs) and argues that developing game agents plays a critical role in advancing toward AGI.

  • It explores the six core components of LLMGAs: perception, memory, thinking, role-playing, action, and learning, which collectively enable these agents to mimic human cognitive processes.

  • It reviews representative LLMGAs across six game genres (adventure, communication, competition, cooperation, simulation, and crafting and exploration), covering the methods used in each and how readily the agents adapt.

  • The paper discusses evaluation methods for LLMGAs and points towards future research directions, including deeper integration of LLMs into game environments and advancing simulations of agent societies.

Advances and Prospects in Large Language Model-based Game Agents

Introduction to LLM-based Game Agents

Game agents play an important role in progress toward AGI, and the rapid evolution of LLMs and their multimodal counterparts (MLLMs) offers new ways to enhance these agents' capabilities. The paper surveys progress on LLM-based game agents (LLMGAs), defines their conceptual architecture, and reviews their application across six game genres. It organizes the discussion around six core components (perception, memory, thinking, role-playing, action, and learning), which gives a common framework for describing current methods and future research directions.

Core Components of LLMGAs

LLMGAs are distinguished by six functional components designed to mimic human cognitive processes:

  • Perception: LLMGAs leverage text and visual input to perceive game environments, employing methods like state variable access, external visual encoders, and multimodal LLMs for accurate perception.
  • Memory: Serving as external storage, memory holds past experiences and knowledge that the agent draws on when formulating strategy.
  • Thinking: Incorporating reasoning and planning, this component analyses information, infers explanations, and strategizes actions based on game states.
  • Role-playing: By simulating specific roles, LLMGAs can generate believable behaviors and dialogues aligned with their characters within games.
  • Action: This component translates the agent's decisions into executable game actions, using different strategies to manipulate game elements.
  • Learning: LLMGAs improve over time through interactions with the game environment, employing methods like in-context feedback learning, supervised fine-tuning, and reinforcement learning.
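As a rough illustration of how the six components listed above might fit together in a single decision step, here is a minimal Python sketch. It is not the paper's implementation: the `GameAgent` class, its method names, the prompt wording, and the `stub_llm` stand-in are all hypothetical, and only the simplest variant of each component is shown (state-variable perception, a list as memory, in-context feedback as learning).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class GameAgent:
    """Hypothetical skeleton wiring the six components into one loop."""

    llm: Callable[[str], str]           # stand-in for any text-in/text-out model
    role: str = "a cautious explorer"   # role-playing: persona placed in the prompt
    memory: List[str] = field(default_factory=list)  # memory: past experiences

    def perceive(self, state: Dict[str, object]) -> str:
        # Perception by direct state-variable access, rendered as text.
        return ", ".join(f"{k}={v}" for k, v in sorted(state.items()))

    def think(self, observation: str) -> str:
        # Thinking: combine role, recent memory, and observation into a prompt.
        recent = "; ".join(self.memory[-3:]) or "none"
        prompt = (
            f"You are {self.role}.\n"
            f"Recent events: {recent}\n"
            f"Observation: {observation}\n"
            "Next action:"
        )
        return self.llm(prompt)

    def act(self, decision: str, legal_actions: List[str]) -> str:
        # Action: map free-form model output onto an executable game action.
        for action in legal_actions:
            if action in decision.lower():
                return action
        return legal_actions[0]  # assumed fallback: first legal action

    def learn(self, observation: str, action: str, feedback: str) -> None:
        # Learning by in-context feedback: store the outcome for later prompts.
        self.memory.append(f"saw [{observation}], did {action}, got {feedback}")

    def step(self, state, legal_actions, env_step: Callable[[str], str]) -> str:
        obs = self.perceive(state)
        action = self.act(self.think(obs), legal_actions)
        self.learn(obs, action, env_step(action))
        return action


def stub_llm(prompt: str) -> str:
    # Toy model used only so the sketch runs without any external service.
    return "I will MOVE north" if "door=open" in prompt else "Better to wait"


agent = GameAgent(llm=stub_llm)
chosen = agent.step({"door": "open", "hp": 10}, ["wait", "move"], lambda a: "reward=1")
print(chosen, agent.memory)
```

In a real agent, `llm` would call an actual model, and `learn` could instead trigger supervised fine-tuning or reinforcement learning; the list-based memory here is only the simplest possible choice.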

Categorization of Games

The paper organizes games into six categories, assessing LLMGAs' performance and strategies within each:

  • Adventure Games: Narrative-driven environments in which agents complete complex quests by interacting with the story.
  • Communication Games: Games built on negotiation and deception, in which players infer others' intentions while concealing their own.
  • Competition Games: Games in which agents must outperform opponents through skill or strategy, which requires advanced reasoning.
  • Cooperation Games: Games in which teamwork and collaborative problem-solving are key to achieving common goals.
  • Simulation Games: Games that simulate real-world scenarios and demand realistic interaction and decision-making.
  • Crafting and Exploration Games: Expansive environments that reward creativity in gathering resources and crafting items.

Evaluation Metrics

Evaluating LLMGAs involves varied metrics across game types, including task success rate for adventure and simulation games, win rates for competition and communication games, and human evaluation for simulations of social interactions.
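The rate-based metrics mentioned above reduce to the fraction of episodes with a positive outcome. The following Python sketch shows one way to compute them; the genre-to-metric mapping mirrors only what this summary states, and the names `METRIC_BY_GENRE`, `fraction_true`, and `evaluate` are hypothetical. Genres for which the summary names no metric are left unspecified rather than guessed, and human evaluation of social simulations is not modeled.

```python
from typing import Dict, List, Optional, Tuple

# Mapping taken from the summary text; None marks genres it does not cover.
METRIC_BY_GENRE: Dict[str, Optional[str]] = {
    "adventure": "task_success_rate",
    "simulation": "task_success_rate",
    "competition": "win_rate",
    "communication": "win_rate",
    "cooperation": None,
    "crafting_and_exploration": None,
}


def fraction_true(outcomes: List[bool]) -> float:
    """Fraction of episodes that succeeded (or were won)."""
    if not outcomes:
        raise ValueError("need at least one episode outcome")
    return sum(outcomes) / len(outcomes)


def evaluate(genre: str, outcomes: List[bool]) -> Tuple[str, float]:
    """Return the metric name for a genre together with its value."""
    metric = METRIC_BY_GENRE.get(genre)
    if metric is None:
        raise ValueError(f"no metric specified for genre {genre!r}")
    return metric, fraction_true(outcomes)


# Four adventure episodes, three of which completed the task.
result = evaluate("adventure", [True, False, True, True])
print(result)  # ('task_success_rate', 0.75)
```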

Implications and Future Directions

The advancement of LLMGAs reflects significant progress toward AGI, particularly in understanding and interacting with complex environments. Future efforts could focus on grounding LLMs in game environments for deeper contextual understanding, enabling knowledge discovery beyond current gameplay to unearth fundamental principles through experience, and enhancing agent society simulations for a closer representation of human social interactions.

Conclusion

The survey underscores the role of LLM-based game agents in AI research. By detailing the architecture, application, and evaluation of LLMGAs across a broad range of game genres, the paper lays groundwork for future work toward human-like AGI. The curated and regularly updated list of papers is a useful resource for researchers exploring game-based AI.
