Emergent Mind

Abstract

The rapid advancement of LLMs has paved the way for the development of highly capable autonomous agents. However, existing multi-agent frameworks often struggle with integrating diverse capable third-party agents due to reliance on agents defined within their own ecosystems. They also face challenges in simulating distributed environments, as most frameworks are limited to single-device setups. Furthermore, these frameworks often rely on hard-coded communication pipelines, limiting their adaptability to dynamic task requirements. Inspired by the concept of the Internet, we propose the Internet of Agents (IoA), a novel framework that addresses these limitations by providing a flexible and scalable platform for LLM-based multi-agent collaboration. IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control. Through extensive experiments on general assistant tasks, embodied AI tasks, and retrieval-augmented generation benchmarks, we demonstrate that IoA consistently outperforms state-of-the-art baselines, showcasing its ability to facilitate effective collaboration among heterogeneous agents. IoA represents a step towards linking diverse agents in an Internet-like environment, where agents can seamlessly collaborate to achieve greater intelligence and capabilities. Our codebase has been released at https://github.com/OpenBMB/IoA.

Conceptual layered architecture design of the Internet of Agents (IoA).

Overview

  • The paper introduces the Internet of Agents (IoA) framework to enhance collaboration and integration of heterogeneous autonomous agents enabled by LLMs.

  • IoA's architecture features dynamic agent registration, discovery, and hierarchical team formation, and incorporates a flexible state mechanism for controlling conversation flow and task execution.

  • Experimental results show IoA consistently outperforms state-of-the-art baselines across various benchmarks, demonstrating its effectiveness and scalability for complex, distributed multi-agent collaborations.

An Overview of "Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence"

The paper "Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence" presents a framework, the Internet of Agents (IoA), for integrating LLM-based autonomous agents and letting them collaborate. It targets three limitations of existing multi-agent systems: it supports flexible integration of diverse third-party agents, enables multi-agent collaboration across distributed devices, and introduces dynamic mechanisms for agent teaming and conversation flow control.

Problem Statement

The rapid development of LLMs has yielded highly capable autonomous agents that approach human-level performance on many tasks. Multi-agent frameworks, however, have not kept pace and are constrained by several limitations:

  • Ecosystem Isolation: Current frameworks are restricted to agents defined within their own ecosystems, which impedes the integration of varied third-party agents.
  • Single-Device Simulation: Existing systems typically run on a single device, which does not reflect real-world distributed environments.
  • Rigid Communication and Coordination: Communication pipelines are predominantly hard-coded, which limits adaptability to dynamic task requirements.

IoA Framework and Key Mechanisms

Framework Overview

Inspired by the design and success of the Internet in human collaboration, IoA introduces a flexible and scalable platform for LLM-based multi-agent collaboration. The framework incorporates:

  • Agent Integration Protocol: Allows seamless integration of diverse third-party agents.
  • Instant Messaging-Like Architecture: Facilitates dynamic agent discovery and teaming.
  • Dynamic Conversation Flow Control: Uses a flexible state mechanism for agent communication and sub-task execution.
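
One way to picture an agent integration protocol is as a small common interface that every agent is wrapped to satisfy, whatever its internals. The sketch below is a minimal illustration under that assumption; the `IntegratedAgent` interface, the `FunctionAgent` adapter, and the `dispatch` helper are hypothetical names and are not taken from the paper or the IoA codebase.

```python
from typing import Callable, Protocol


class IntegratedAgent(Protocol):
    """Hypothetical minimal interface that every integrated agent exposes."""

    name: str

    def run(self, task: str) -> str: ...


class FunctionAgent:
    """Adapter that wraps any plain callable so it satisfies IntegratedAgent."""

    def __init__(self, name: str, fn: Callable[[str], str]) -> None:
        self.name = name
        self._fn = fn

    def run(self, task: str) -> str:
        return self._fn(task)


def dispatch(agents: list[IntegratedAgent], task: str) -> dict[str, str]:
    """Send the same task to every agent and collect the replies by agent name."""
    return {agent.name: agent.run(task) for agent in agents}


# Two stand-ins for third-party agents with unrelated internals.
shouter = FunctionAgent("shouter", str.upper)
counter = FunctionAgent("counter", lambda task: f"{len(task.split())} words")

replies = dispatch([shouter, counter], "summarise this paper")
print(replies)  # {'shouter': 'SUMMARISE THIS PAPER', 'counter': '3 words'}
```

The calling code depends only on the shared interface, so a new agent can join by supplying an adapter rather than by being rewritten inside the framework.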

Detailed Architecture

The IoA framework is structured into three layers:

  1. Foundation Layer: Provides the essential infrastructure for agent integration, data management, and network communication.
  2. Data Layer: Manages information related to agents, group chats, and tasks.
  3. Interaction Layer: Facilitates team formation and agent communication.

This layered architecture supports scalable, adaptive multi-agent collaboration, allowing agents distributed across multiple devices and locations to cooperate effectively.

Key Mechanisms

Agent Registration and Discovery

IoA employs an agent registration and discovery mechanism that integrates third-party agents, enabling them to be discovered and recruited dynamically based on task requirements.
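
Registration and discovery can be sketched as a registry that stores agent profiles and returns the ones matching a task's needs. The example below is a toy in-memory version under stated assumptions: the `AgentProfile` fields, the capability tags, and the overlap-based ranking are all hypothetical, and the summary does not say how IoA actually matches agents to tasks.

```python
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    """Hypothetical record an agent submits when it registers."""

    name: str
    description: str
    capabilities: set[str] = field(default_factory=set)


class AgentRegistry:
    """Toy in-memory registry; a real server would persist and network this."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentProfile] = {}

    def register(self, profile: AgentProfile) -> None:
        # Registering again under the same name overwrites the old profile.
        self._agents[profile.name] = profile

    def discover(self, required: set[str]) -> list[AgentProfile]:
        """Return agents offering at least one required capability, best match first."""
        scored = [(len(required & p.capabilities), p) for p in self._agents.values()]
        matches = [(score, p) for score, p in scored if score > 0]
        # Sort by overlap (descending), then by name for a stable order.
        matches.sort(key=lambda pair: (-pair[0], pair[1].name))
        return [p for _, p in matches]


registry = AgentRegistry()
registry.register(AgentProfile("coder", "Writes and runs code", {"python", "shell"}))
registry.register(AgentProfile("browser", "Searches the web", {"web_search"}))
registry.register(AgentProfile("analyst", "Analyses data", {"python", "web_search"}))

found = [p.name for p in registry.discover({"python", "web_search"})]
print(found)  # ['analyst', 'browser', 'coder']
```

Exact tag overlap is used here only to keep the example deterministic; matching on free-text descriptions would be an equally plausible design.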

Autonomous Nested Team Formation

This mechanism allows dynamic and hierarchical team formation where teams and nested sub-teams are composed adaptively based on evolving task requirements.
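
Nested teams form a tree: a team working on a task can open a sub-team for a sub-task, and that sub-team can do the same. The sketch below models only this structure. The `Team` class, its methods, and the example agents are hypothetical, and in IoA the decision to form a sub-team is made by the agents themselves rather than by explicit calls like these.

```python
from dataclasses import dataclass, field


@dataclass
class Team:
    """A group working on one task, possibly with nested sub-teams."""

    task: str
    members: list[str]
    sub_teams: list["Team"] = field(default_factory=list)

    def spawn_sub_team(self, sub_task: str, members: list[str]) -> "Team":
        # A member that needs extra help opens a nested group for the sub-task.
        child = Team(sub_task, members)
        self.sub_teams.append(child)
        return child

    def depth(self) -> int:
        """Number of nesting levels, counting this team as level 1."""
        return 1 + max((t.depth() for t in self.sub_teams), default=0)

    def all_members(self) -> set[str]:
        """Every agent involved anywhere in the hierarchy."""
        names = set(self.members)
        for team in self.sub_teams:
            names |= team.all_members()
        return names


root = Team("write a market report", ["planner", "writer"])
research = root.spawn_sub_team("collect sources", ["writer", "browser"])
research.spawn_sub_team("extract tables from PDFs", ["browser", "pdf_reader"])

print(root.depth())                # 3
print(sorted(root.all_members()))  # ['browser', 'pdf_reader', 'planner', 'writer']
```

In this example the agent that opens a sub-team also belongs to it, which lets it carry the sub-team's result back to its parent team.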

Autonomous Conversation Flow Control

Inspired by Speech Act Theory, this mechanism employs a finite state machine to manage conversation states, enabling structured communication and efficient task execution among agents.
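
A finite state machine for conversation control can be sketched as a set of states plus a table of allowed transitions. The state names and the transition table below are illustrative assumptions and are not taken from this summary. In IoA the next state would be chosen by an LLM-driven agent, whereas this sketch lets the caller choose and only checks that the move is legal.

```python
from enum import Enum, auto


class State(Enum):
    DISCUSSION = auto()
    TASK_ASSIGNMENT = auto()
    PAUSE = auto()
    CONCLUSION = auto()


# Allowed transitions (an assumption made for illustration).
TRANSITIONS = {
    State.DISCUSSION: {State.DISCUSSION, State.TASK_ASSIGNMENT, State.CONCLUSION},
    State.TASK_ASSIGNMENT: {State.PAUSE, State.DISCUSSION},
    State.PAUSE: {State.DISCUSSION},
    State.CONCLUSION: set(),
}


class Conversation:
    """Tracks the current state of one group chat and rejects illegal moves."""

    def __init__(self) -> None:
        self.state = State.DISCUSSION
        self.history = [self.state]

    def move_to(self, nxt: State) -> None:
        if nxt not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {nxt.name}")
        self.state = nxt
        self.history.append(nxt)

    @property
    def finished(self) -> bool:
        return self.state is State.CONCLUSION


chat = Conversation()
for step in (State.TASK_ASSIGNMENT, State.PAUSE, State.DISCUSSION, State.CONCLUSION):
    chat.move_to(step)

print([s.name for s in chat.history])
# ['DISCUSSION', 'TASK_ASSIGNMENT', 'PAUSE', 'DISCUSSION', 'CONCLUSION']
print(chat.finished)  # True
```

Because `CONCLUSION` has no outgoing transitions, a finished conversation cannot be reopened by mistake.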

Experimental Validation

IoA's effectiveness is demonstrated through experiments on benchmarks covering general assistant tasks, embodied AI tasks, and retrieval-augmented generation tasks. Across these, IoA consistently outperforms state-of-the-art (SoTA) baselines.

  • GAIA Benchmark: IoA integrates four ReAct agents and achieves higher overall performance compared to SoTA approaches, demonstrating its capacity to manage heterogeneous tools.
  • Open-Ended Instruction Benchmark: IoA integrates AutoGPT and Open Interpreter, showing significant performance advantages across diverse task categories.
  • Embodied AI Tasks (RoCoBench): IoA's performance surpasses a specialized multi-agent framework, indicating its effectiveness in contexts requiring collaboration among agents with different observation and action capabilities.
  • Retrieval-Augmented Generation Tasks: IoA, even with a GPT-3.5 core, performs on par with or better than GPT-4, illustrating its efficacy in handling heterogeneous knowledge sources.

Implications and Future Directions

IoA represents a significant advancement towards creating versatile and capable multi-agent systems. The ability to integrate diverse third-party agents and support distributed environments positions IoA as a pivotal platform for future multi-agent collaboration.

Practical Implications:

  • Enhances adaptability and flexibility in multi-agent collaborations across various domains.
  • Demonstrates scalability and efficiency in managing complex, real-world tasks requiring diverse toolsets and capabilities.

Theoretical Implications:

  • Offers a robust framework for exploring new dynamics in multi-agent collaboration.
  • Provides a basis for developing more sophisticated and adaptable multi-agent systems.

Future Directions:

  • As LLMs evolve and become more efficient, deploying agents on personal computers or even mobile devices will become increasingly feasible.
  • Future research could explore optimizing LLM alignment for efficient agent communication, further reducing operational costs and improving multi-agent task execution.

In conclusion, IoA bridges critical gaps in current multi-agent frameworks, offering a forward-looking platform for integrating and orchestrating heterogeneous autonomous agents in collaborative intelligence. This work opens up new research avenues and practical applications, pushing the boundaries of what autonomous agents can achieve.
