Cognitive Architectures for Language Agents

(2309.02427)
Published Sep 5, 2023 in cs.AI, cs.CL, cs.LG, and cs.SC

Abstract

Recent efforts have augmented LLMs with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents. While these agents have achieved substantial empirical success, we lack a systematic framework to organize existing agents and plan future developments. In this paper, we draw on the rich history of cognitive science and symbolic artificial intelligence to propose Cognitive Architectures for Language Agents (CoALA). CoALA describes a language agent with modular memory components, a structured action space to interact with internal memory and external environments, and a generalized decision-making process to choose actions. We use CoALA to retrospectively survey and organize a large body of recent work, and prospectively identify actionable directions towards more capable agents. Taken together, CoALA contextualizes today's language agents within the broader history of AI and outlines a path towards language-based general intelligence.

Overview

  • The paper introduces CoALA, a framework for developing language agents grounded in cognitive science and symbolic AI principles.

  • CoALA emphasizes a structured integration of LLMs with a focus on memory modularity, action space organization, and decision-making.

  • Memory in language agents is broken down into working, episodic, semantic, and procedural types, with a unique emphasis on actionable memory.

  • The authors outline an explicit action space for agents, distinguishing between internal cognitive actions and external grounding actions.

  • The decision-making cycle of agents is detailed as a continuous loop of proposing, evaluating, and selecting actions, which is essential for higher-level cognition.

Introduction

In "Cognitive Architectures for Language Agents," Sumers et al. present the Cognitive Architectures for Language Agents (CoALA), a structured blueprint aimed at systematizing the development of language agents. This framework is anchored in the history of cognitive science and AI, incorporating principles from symbolic AI to refine the current approach to building language agents. CoALA provides a coherent structure for integrating LLMs into agents, emphasizing modular memory components, an organized action space, and generalized decision-making procedures.

Memory and Action Space

The authors dissect the anatomy of language agents along three critical dimensions: memory, action space, and decision-making process. Information storage is divided into working and long-term memory: working memory holds the variables active during the current decision cycle, while long-term memory is subdivided into episodic, semantic, and procedural stores. The notion of actionable memory, in which agents read from and write to these modules, contrasts sharply with standard language models, which are stateless.
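To make the memory taxonomy concrete, here is a minimal Python sketch of how a CoALA-style agent's memory could be organized; the class and field names are illustrative assumptions rather than an implementation from the paper.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class WorkingMemory:
    """Short-lived state for the current decision cycle (roughly, program variables)."""
    observation: str = ""
    goal: str = ""
    scratchpad: list[str] = field(default_factory=list)  # intermediate reasoning steps


@dataclass
class LongTermMemory:
    """Persistent stores the agent can read from and write to across cycles."""
    episodic: list[dict[str, Any]] = field(default_factory=list)  # past experiences
    semantic: list[str] = field(default_factory=list)             # knowledge about the world
    procedural: dict[str, str] = field(default_factory=dict)      # skills: prompts, code, policies


@dataclass
class AgentMemory:
    working: WorkingMemory = field(default_factory=WorkingMemory)
    long_term: LongTermMemory = field(default_factory=LongTermMemory)
```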

Agents perform actions within an explicitly defined space that separates internal actions (reasoning, retrieval, and learning) from external grounding actions. Grounding actions interface with the outside world, whether a physical environment, dialogue with humans or other agents, or a digital environment. The interplay between actions and memory retrieval, a dimension often neglected in standard retrieval-augmented models, is highlighted as a key feature of capable agents.
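A hedged sketch of how this action space might be encoded follows, assuming a simple enum-based taxonomy; the categories mirror the paper's internal/external split, but the code structure itself is an illustrative assumption.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ActionKind(Enum):
    # Internal actions operate only on the agent's own memory.
    REASON = auto()    # update working memory, e.g. via an LLM call
    RETRIEVE = auto()  # read from long-term memory into working memory
    LEARN = auto()     # write new information to long-term memory
    # External action: interact with the outside environment.
    GROUND = auto()    # e.g. call a tool, act in a simulator, reply to a user


@dataclass
class Action:
    kind: ActionKind
    payload: dict  # arguments, e.g. a query for RETRIEVE or a tool call for GROUND


def is_internal(action: Action) -> bool:
    """Internal actions never touch the external world; they only read or write memory."""
    return action.kind is not ActionKind.GROUND
```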

Decision-Making Cycle

The decision-making procedure is described as a cycle in which an agent proposes, evaluates, and selects actions to execute. The cycle resembles a program's main loop: the agent continuously takes in observations and cycles through its available actions. Sumers et al. argue that this planning-and-execution process is critical for higher-level agent cognition, enabling actions that affect both the external world and the agent's long-term memories.
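The cycle can be summarized with a short Python sketch; `propose`, `evaluate`, `execute`, and the `env` interface are hypothetical placeholders for LLM-backed components, not APIs from the paper.

```python
def decision_loop(agent, env):
    """Propose-evaluate-select cycle, repeated until the task (or episode) ends."""
    observation = env.reset()
    while not env.done():
        agent.memory.working.observation = observation

        # Planning stage: internal actions (reasoning, retrieval) produce candidates.
        candidates = agent.propose()                            # propose candidate actions
        scored = [(agent.evaluate(a), a) for a in candidates]   # evaluate each candidate
        _, chosen = max(scored, key=lambda pair: pair[0])       # select the highest-valued one

        # Execution stage: the chosen action updates long-term memory or the environment.
        observation = agent.execute(chosen, env)
```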

Future Directions

The paper situates CoALA within the broader trajectory of AI research and proposes actionable directions for future work: design considerations for agent frameworks in both research and industry, structured reasoning beyond prompt engineering, deeper treatment of long-term memory and learning, and a structured approach to the action space that goes beyond typical tool use. As LLMs evolve, so too may frameworks like CoALA, adapting to balance the capabilities of artificial agents against the complexities of decision-making in any given context.

CoALA initiates a conversation about the transformation from language models to full-fledged cognitive agents, moving towards overarching goals of general intelligence. Crucially, the iterative development process of CoALA points towards synergistic research opportunities between language models and cognitive architectures, perhaps contributing to the next leap in AI sophistication.
