A Survey on Large Language Model based Autonomous Agents (2308.11432v7)
Abstract: Autonomous agents have long been a prominent research focus in both the academic and industry communities. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and thus makes it hard for the agents to achieve human-like decisions. Recently, through the acquisition of vast amounts of web knowledge, LLMs have demonstrated remarkable potential in achieving human-level intelligence. This has sparked an upsurge in studies investigating LLM-based autonomous agents. In this paper, we present a comprehensive survey of these studies, delivering a systematic review of the field of LLM-based autonomous agents from a holistic perspective. More specifically, we first discuss the construction of LLM-based autonomous agents, for which we propose a unified framework that encompasses a majority of the previous work. Then, we present a comprehensive overview of the diverse applications of LLM-based autonomous agents in the fields of social science, natural science, and engineering. Finally, we delve into the evaluation strategies commonly used for LLM-based autonomous agents. Based on the previous studies, we also present several challenges and future directions in this field. To keep track of this field and continuously update our survey, we maintain a repository of relevant references at https://github.com/Paitesanshi/LLM-Agent-Survey.
Glossary
Below is an alphabetical list of advanced domain-specific terms from the paper, each with a brief definition and a verbatim usage example. Illustrative code sketches for several of these mechanisms follow the glossary.
- Admissible actions: Actions that satisfy the constraints or preconditions of a planning environment, used for selecting valid next steps. "then determine the final one based on their distances to admissible actions."
- Algorithm of Thoughts (AoT): A prompting strategy that embeds algorithmic examples to improve structured reasoning in LLMs. "In AoT~\cite{sel2023algorithm}, the authors design a novel method to enhance the reasoning processes of LLMs by incorporating algorithmic examples into the prompts."
- Artificial General Intelligence (AGI): A form of AI aiming for human-level generality across diverse tasks via autonomous planning and action. "Autonomous agents have long been recognized as a promising approach to achieving artificial general intelligence (AGI), which is expected to accomplish tasks through self-directed planning and actions."
- Chain of Thought (CoT): A prompting technique that elicits step-by-step reasoning traces to solve complex problems. "Chain of Thought (CoT)~\cite{wei2022chain} proposes inputting reasoning steps for solving complex problems into the prompt."
- Context window: The maximum span of input tokens a transformer-based LLM can attend to at once. "short-term memory is analogous to the input information within the context window constrained by the transformer architecture."
- Embedding vectors: Numeric representations of text or memory items enabling efficient retrieval and similarity search. "memory information is encoded into embedding vectors, which can enhance the memory retrieval and reading efficiency."
- Embodied agent: An agent that plans and acts within a physical or simulated environment, often grounded in perception. "SayPlan~\cite{rana2023sayplan} is an embodied agent specifically designed for task planning."
- Environmental Feedback: Signals received from the environment (real or simulated) that inform subsequent planning or action. "Environmental Feedback. This feedback is obtained from the objective world or virtual environment."
- External planner: A specialized planning tool (often operating on formal representations) used to compute action sequences. "To address this challenge, researchers turn to external planners."
- FAISS: A library for efficient similarity search on high-dimensional vectors, commonly used for memory retrieval. "s^{rel}(q,m) can be realized based on LSH, ANNOY, HNSW, FAISS and so on." See the retrieval sketch after this glossary.
- Few-shot examples: A small set of labeled instances provided in prompts to guide LLMs in generating or classifying similar outputs. "Then, one can optionally specify several seed agent profiles to serve as few-shot examples."
- Graph of Thoughts (GoT): An extension of tree-based reasoning that structures multiple reasoning paths as a graph for richer exploration. "In GoT~\cite{besta2023graph}, the authors expand the tree-like reasoning structure in ToT to graph structures, resulting in more powerful prompting strategies."
- Grounded re-planning algorithm: A method that revises plans based on observed mismatches between planned and actual world states. "LLM-Planner~\cite{song2023llmplanner} introduces a grounded re-planning algorithm that dynamically updates plans generated by LLMs when encountering object mismatches and unattainable plans during task completion."
- Hallucination: The tendency of LLMs to generate factually incorrect or unfounded outputs. "In addition, LLMs may also encounter hallucination problems, which are hard for them to resolve by themselves."
- Heuristic policy functions: Rule-of-thumb policies guiding agent actions, often used in simplified or restricted environments. "the agents are assumed to act based on simple and heuristic policy functions, and learned in isolated and restricted environments"
- HNSW: Hierarchical Navigable Small World graphs for fast approximate nearest neighbor search. "s^{rel}(q,m) can be realized based on LSH, ANNOY, HNSW, FAISS and so on."
- Human Feedback: Guidance from human users that helps align agents with human preferences and correct errors in planning. "Human Feedback. In addition to obtaining feedback from the environment, directly interacting with humans is also a very intuitive strategy to enhance the agent planning capability."
- In-context learning: Conditioning LLM behavior via examples or instructions placed directly in the prompt without parameter updates. "This structure only simulates the human short-term memory, which is usually realized by in-context learning, and the memory information is directly written into the prompts."
- Key-value list structure: A memory design storing items as key-value pairs (e.g., vector keys with natural-language values) to enable efficient retrieval. "A notable example is the memory module of GITM~\cite{zhu2023ghost}, which utilizes a key-value list structure." See the key-value memory sketch after this glossary.
- Locality-Sensitive Hashing (LSH): A hashing technique that preserves similarity, enabling fast retrieval of related items. "s^{rel}(q,m) can be realized based on LSH, ANNOY, HNSW, FAISS and so on."
- Long-horizon planning: Planning that spans many steps to tackle complex tasks with extended reasoning chains. "In many real-world scenarios, the agents need to make long-horizon planning to solve complex tasks."
- Long-term memory: Persistent memory that consolidates and stores information over time for later retrieval. "The short-term memory temporarily buffers recent perceptions, while long-term memory consolidates important information over time."
- Memory reading: Retrieving relevant, recent, and important information from memory to guide current actions. "The objective of memory reading is to extract meaningful information from memory to enhance the agent's actions." See the memory-scoring sketch after this glossary.
- Memory reflection: Summarizing and abstracting past experiences to derive higher-level insights that guide future behavior. "Memory reflection emulates humans' ability to witness and evaluate their own cognitive, emotional, and behavioral processes."
- Memory writing: Storing new information about observations or actions into memory, handling duplicates and overflows. "The purpose of memory writing is to store information about the perceived environment in memory."
- Monte Carlo Tree Search (MCTS): A simulation-based search algorithm used to evaluate and choose plans via sampled rollouts. "RAP~\cite{hao2023reasoning} builds a world model to simulate the potential benefits of different plans based on Monte Carlo Tree Search (MCTS), and then, the final plan is generated by aggregating multiple MCTS iterations." See the rollout sketch after this glossary.
- Planning Domain Definition Language (PDDL): A formal language for specifying planning problems and domains for automated planners. "LLM+P~\cite{liu2023llmp+} first transforms the task descriptions into formal Planning Domain Definition Languages (PDDL), and then it uses an external planner to deal with the PDDL." See the PDDL sketch after this glossary.
- Scene graphs: Structured representations of entities and relations in a scene, used for grounded planning. "In this agent, the scene graphs and environment feedback serve as the agent's short-term memory, guiding its actions."
- Self-consistent CoT (CoT-SC): A technique that samples multiple CoT reasoning paths and selects the most consistent final answer. "Self-consistent CoT (CoT-SC)~\cite{wang2022self} believes that each complex problem has multiple ways of thinking to deduce the final answer." See the self-consistency sketch after this glossary.
- Short-term memory: Temporarily maintained information (often within the prompt or context window) that guides immediate actions. "short-term memory is analogous to the input information within the context window constrained by the transformer architecture."
- Sliding window: A bounded, moving window over recent history used to retain the latest information for decision-making. "{Reflexion}~\cite{shinn2023reflexion} utilizes a short-term sliding window to capture recent feedback and incorporates persistent long-term storage to retain condensed insights." See the sliding-window sketch after this glossary.
- Symbolic memory: Memory represented in structured, queryable forms (e.g., databases) enabling precise manipulation. "For example, {ChatDB}~\cite{hu2023chatdb} uses a database as a symbolic memory module." See the symbolic-memory sketch after this glossary.
- Transformer architecture: A neural network design leveraging self-attention mechanisms, constraining context via window size. "short-term memory is analogous to the input information within the context window constrained by the transformer architecture."
- Tree of Thoughts (ToT): A reasoning framework that explores branching thought sequences as a tree, evaluated step by step. "Tree of Thoughts (ToT)~\cite{yao2023tree} is designed to generate plans using a tree-like reasoning structure."
- Vector database: A storage system for vector embeddings enabling efficient similarity search and retrieval. "the authors propose a long-term memory system that utilizes a vector database, facilitating efficient storage and retrieval."
- Vector storage: External storage of vectorized information for fast querying by similarity. "Long-term memory resembles the external vector storage that agents can rapidly query and retrieve from as needed."
- World model: An internal simulation or representation of the environment used to evaluate and choose plans. "RAP~\cite{hao2023reasoning} builds a world model to simulate the potential benefits of different plans..."
- Zero-shot-CoT: A prompting approach that induces step-by-step reasoning without examples using trigger phrases. "Zero-shot-CoT~\cite{kojima2022large} enables LLMs to generate task reasoning processes by prompting them with trigger sentences like 'think step by step'."
- Zero-shot planner: Using an LLM to plan without task-specific training by prompting it to generate action sequences. "In~\cite{huang2022language}, the LLMs are leveraged as zero-shot planners."
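Illustrative Sketches
The sketches below are minimal, hedged illustrations of mechanisms defined above; they are written under stated assumptions and are not implementations from the survey or the systems it cites.
First, relevance-based memory retrieval over embedding vectors with FAISS, assuming the faiss-cpu and numpy packages; the `embed` function is a placeholder for a real sentence-embedding model, so with it the retrieved neighbors are arbitrary.
```python
# Minimal memory-retrieval sketch: embed memories, index them with FAISS,
# and fetch the nearest (most relevant) ones for a query.
import hashlib
import numpy as np
import faiss

DIM = 64  # embedding dimensionality (illustrative)

def embed(text: str) -> np.ndarray:
    # Placeholder embedding: a reproducible pseudo-random vector seeded by a
    # hash of the text; swap in a real sentence encoder in practice.
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).standard_normal(DIM).astype("float32")
    return v / np.linalg.norm(v)  # unit norm, so L2 distance tracks cosine

memories = [
    "Agent picked up the key in room 2.",
    "The door to room 3 is locked.",
    "User prefers concise answers.",
]
index = faiss.IndexFlatL2(DIM)  # exact (brute-force) L2 search
index.add(np.stack([embed(m) for m in memories]))

query = embed("Which door was locked?").reshape(1, -1)
distances, ids = index.search(query, 2)  # top-2 nearest memories
for dist, i in zip(distances[0], ids[0]):
    print(f"dist={dist:.3f}  memory={memories[i]}")
```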
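The relevance score s^{rel}(q, m) quoted in several entries above is typically only one signal used during memory reading; recency and importance scores are commonly combined with it. A hedged sketch of such a combined objective, where the weights α, β, γ and the arg max formulation are illustrative rather than the paper's exact notation:
```latex
% Illustrative memory-reading objective: select the memory m that best
% balances recency, relevance to the query q, and importance.
m^{*} = \operatorname*{arg\,max}_{m \in M}
        \; \alpha\, s^{rec}(q, m) + \beta\, s^{rel}(q, m) + \gamma\, s^{imp}(m)
```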
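Short-term memory realized by in-context learning is often just a bounded buffer over recent interactions, in the spirit of the sliding window used by Reflexion. A minimal standard-library sketch, where the window size and the observation strings are illustrative:
```python
# Sliding-window short-term memory: keep only the k most recent
# observations and splice them into the prompt (in-context learning).
from collections import deque

WINDOW = 3
short_term = deque(maxlen=WINDOW)  # older entries are evicted automatically

for step, obs in enumerate(["saw a key", "opened a chest",
                            "found a map", "heard a noise"]):
    short_term.append(f"step {step}: {obs}")

prompt = "Recent observations:\n" + "\n".join(short_term) + "\nNext action:"
print(prompt)  # only the 3 most recent observations remain in context
```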
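A key-value list structure along the lines described for GITM can be sketched as vector keys paired with natural-language values, with reads ranking the keys by cosine similarity. The schema and the random key vectors below are assumptions for illustration; in practice the keys would come from an embedding model:
```python
# Key-value memory sketch: vector keys for retrieval, plain-text values
# for splicing back into prompts.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

store: list[tuple[np.ndarray, str]] = []

def write(key_vec: np.ndarray, value: str) -> None:
    store.append((key_vec, value))

def read(query_vec: np.ndarray, k: int = 1) -> list[str]:
    ranked = sorted(store, key=lambda kv: cosine(query_vec, kv[0]), reverse=True)
    return [value for _, value in ranked[:k]]

rng = np.random.default_rng(0)
k1, k2 = rng.standard_normal(8), rng.standard_normal(8)
write(k1, "crafting a pickaxe requires sticks and planks")
write(k2, "zombies spawn in the dark")
print(read(k1 + 0.01 * rng.standard_normal(8)))  # retrieves the nearest value
```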
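Symbolic memory in the style of ChatDB stores facts in a database that the agent manipulates through SQL. A minimal standard-library sketch with an invented schema; ChatDB's actual schema and query pipeline are more involved:
```python
# Symbolic-memory sketch: an in-memory SQLite database the agent writes to
# and queries with precise, structured SQL.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memory (step INTEGER, kind TEXT, content TEXT)")

# Memory writing: record structured observations and actions.
db.executemany(
    "INSERT INTO memory VALUES (?, ?, ?)",
    [(1, "observation", "door A is locked"),
     (2, "action", "picked up key"),
     (3, "observation", "door A now unlocked")],
)

# Memory reading: precise retrieval via a symbolic query.
rows = db.execute(
    "SELECT content FROM memory WHERE kind = 'observation' ORDER BY step DESC"
).fetchall()
print(rows)  # most recent observations first
```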
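The prompting strategies above compose naturally: Zero-shot-CoT appends a trigger phrase, and self-consistent CoT samples several reasoning paths and keeps the most common final answer. In this sketch, `llm` is a stand-in for any sampling-enabled completion function (not a real API), and `extract_answer` is a hypothetical helper that assumes traces end with an "Answer:" marker:
```python
# Self-consistent CoT sketch: sample several step-by-step reasoning paths
# and majority-vote over their final answers.
from collections import Counter

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in a real model call with sampling enabled")

def extract_answer(completion: str) -> str:
    # Hypothetical helper: pull the final answer out of a reasoning trace.
    return completion.rsplit("Answer:", 1)[-1].strip()

def self_consistent_cot(question: str, samples: int = 5) -> str:
    prompt = f"{question}\nLet's think step by step."  # Zero-shot-CoT trigger
    answers = [extract_answer(llm(prompt)) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]       # majority vote
```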
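Handing planning to an external planner, as LLM+P does, means translating the task into PDDL and invoking a solver on the result. The toy domain below and the `run_external_planner` hook are assumptions for illustration, since planner interfaces vary:
```python
# External-planner sketch: the LLM's role is to produce PDDL; a classical
# planner (not modeled here) searches for the action sequence.
DOMAIN = """
(define (domain switch)
  (:predicates (on) (off))
  (:action flip-on
    :precondition (off)
    :effect (and (on) (not (off)))))
"""

PROBLEM = """
(define (problem light)
  (:domain switch)
  (:init (off))
  (:goal (on)))
"""

def run_external_planner(domain: str, problem: str) -> list[str]:
    # Hypothetical hook: in practice this would shell out to a planner
    # such as Fast Downward and parse the plan it emits.
    raise NotImplementedError

# plan = run_external_planner(DOMAIN, PROBLEM)  # e.g., ["(flip-on)"]
```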
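Finally, RAP's pairing of a world model with MCTS can be approximated, in a much-simplified form, by rolling candidate plans forward through a simulator and aggregating their returns. This rollout sketch omits the search tree and UCB-style selection of full MCTS; the toy dynamics are invented for illustration:
```python
# Simplified rollout aggregation (not full MCTS): score each candidate plan
# by simulating it repeatedly in a toy stochastic world model and keep the
# plan with the best average return.
import random

def world_model(state: int, action: str) -> tuple[int, float]:
    # Toy stochastic dynamics: "advance" usually succeeds, "wait" never helps.
    if action == "advance" and random.random() < 0.8:
        return state + 1, 1.0
    return state, 0.0

def rollout(plan: list[str], start: int = 0) -> float:
    state, total = start, 0.0
    for action in plan:
        state, reward = world_model(state, action)
        total += reward
    return total

def best_plan(plans: list[list[str]], n_rollouts: int = 100) -> list[str]:
    return max(plans, key=lambda p: sum(rollout(p) for _ in range(n_rollouts)))

print(best_plan([["advance", "advance"], ["wait", "advance"]]))
```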