Emergent Mind

Improving Multi-Agent Debate with Sparse Communication Topology

(2406.11776)
Published Jun 17, 2024 in cs.CL

Abstract

Multi-agent debate has proven effective in improving the quality of LLMs on reasoning and factuality tasks. While various role-playing strategies in multi-agent debates have been explored, in terms of the communication among agents, existing approaches adopt a brute-force scheme -- each agent can communicate with all other agents. In this paper, we systematically investigate the effect of communication connectivity in multi-agent systems. Our experiments on GPT and Mistral models reveal that multi-agent debates leveraging sparse communication topology can achieve comparable or superior performance while significantly reducing computational costs. Furthermore, we extend the multi-agent debate framework to multimodal reasoning and alignment labeling tasks, showcasing its broad applicability and effectiveness. Our findings underscore the importance of communication connectivity in enhancing the efficiency and effectiveness of the "society of minds" approach.

Figure: Accuracy and inference input-cost comparison between two communication topologies in a multi-agent debate system.

Overview

  • The paper explores optimizing the communication topology in multi-agent debate frameworks to enhance the efficiency and effectiveness of LLMs, demonstrating that sparse communication can reduce computational costs while maintaining or improving performance.

  • Experiments on text-only and multimodal reasoning tasks as well as alignment labeling tasks validate the benefits of sparse communication topology, showing substantial improvements in accuracy and reductions in computational expenses.

  • The study argues that the quality and diversity of information shared in MAD frameworks matter more than its sheer volume, and highlights the advantage of asymmetrically assigning roles to agents based on their strengths to optimize collective problem-solving capability.

Improving Multi-Agent Debate with Sparse Communication Topology

The paper "Improving Multi-Agent Debate with Sparse Communication Topology" by Peter Grabowski, Yeqing Li, and Eugene Ie investigates the optimization of communication topology in multi-agent debate (MAD) frameworks for enhancing the efficiency and effectiveness of LLMs and multimodal LLMs (MLLMs).

Abstract Overview

The authors explore how the connectivity among agents in a MAD framework affects performance and computational cost. Existing approaches adopt a fully-connected topology where each agent communicates with every other agent, leading to significant computational expenses. This study systematically examines sparse communication topology and demonstrates that it can achieve comparable or superior performance with reduced computational costs. The findings are validated on reasoning tasks using GPT and Mistral models and extended to multimodal reasoning and alignment labeling tasks.

Key Contributions

Sparse Communication Topology:

  • Sparse communication topology in MAD can deliver similar or even better results compared to fully-connected MAD, with significantly lower inference costs. Notably, a neighbor-connected MAD improved accuracy by +2% on the MATH dataset and maintained the same accuracy on GSM8K, while reducing input token costs by over 40%.
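The cost gap can be made concrete with a small sketch (an illustration, not the paper's implementation): with five agents, a fully-connected topology carries 20 directed peer messages per round, while a neighbor-connected ring carries only 10, which is where the input-token savings come from.

```python
# Illustrative sketch (not the paper's code): counting the inter-agent
# messages each communication topology produces per debate round.

def fully_connected(n):
    """Each agent receives the responses of every other agent."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def neighbor_connected(n):
    """Ring topology: each agent receives only its two neighbors' responses."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def edge_count(topology):
    """Directed edges = peer responses ingested per round."""
    return sum(len(nbrs) for nbrs in topology.values())

print(edge_count(fully_connected(5)))    # 20
print(edge_count(neighbor_connected(5))) # 10
```

Since each ingested peer response adds input tokens, halving the edge count roughly halves the inter-agent portion of the input context, consistent in spirit with the 40%+ token savings reported above.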

Extension to Multimodal Reasoning and Alignment Labeling:

  • The study extends the MAD framework to multimodal reasoning and alignment labeling tasks, demonstrating its versatility and broad applicability. Sparse MAD achieves substantial improvements in helpfulness and harmlessness while reducing computational costs.

Efficiency Insights:

  • Sparse communication sustains effective debate across more rounds before consensus is reached, deepening the discussion and encouraging critical examination, which leads to improved outcomes. The study also shows that weaker models can be strengthened through interaction with stronger models within the MAD framework.

Experimental Setup and Results

Text-Only Reasoning Tasks

The experiments focus on text-only reasoning tasks using the MATH and GSM8K datasets with GPT-3.5 models. Sparse MAD variants, with varying degrees of connectivity, were compared to fully-connected MAD:

  • MATH: Sparse MAD ($D=2/5$) yielded a +7.5% improvement with a 41.5% reduction in token costs compared to fully-connected MAD.
  • GSM8K: Sparse MAD ($D=3/5$) achieved a +6.5% boost in accuracy with cost savings of 43.5%.

Multimodal Reasoning Tasks

Using the MathVista dataset and GPT-4, the performances of sparse MAD were compared:

  • Sparse MAD configurations improved accuracy by up to +1.2% over fully-connected MAD while reducing token costs by up to 33.1%.

Alignment Labeling Tasks

Alignment labeling experiments used the Anthropic-HH dataset to validate effectiveness on alignment tasks:

  • Helpfulness Task: Sparse MAD ($D=2/5$) improved accuracy by +0.5% with cost savings up to 50%.
  • Harmlessness Task: Similar improvements were observed with the GPT-3.5 and Mistral 7B models.

Theoretical Insights and Design Strategies

The study postulates that MAD is not merely dependent on the amount of information shared but also on the quality and diversity of this information. Sparse connectivity diversifies the input, potentially mitigating cascading misinformation, which occurs in fully-connected settings where erroneous information can circulate unchecked. Analyzing rounds before consensus, the research found that sparse topologies extend the number of effective debate rounds, fostering more thorough deliberation among agents.
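A toy majority-vote simulation (an assumption-laden stand-in for LLM agents, not the authors' method) illustrates the dynamic: under full connectivity a minority answer is wiped out in a single round, whereas a sparse ring lets it survive into later rounds, preserving the diversity that fuels further debate.

```python
from collections import Counter

def fully_connected(n):
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def ring(n):
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def debate_round(answers, topology):
    """Each agent adopts the majority answer among itself and its neighbors --
    a toy proxy for an LLM revising its answer after reading peer responses."""
    return [Counter([answers[i]] + [answers[j] for j in nbrs]).most_common(1)[0][0]
            for i, nbrs in topology.items()]

answers = [1, 1, 0, 0, 0]  # two agents hold a minority view
after_full = debate_round(answers, fully_connected(5))
after_ring = debate_round(answers, ring(5))
print(after_full)  # [0, 0, 0, 0, 0]  minority erased in one round
print(after_ring)  # [1, 1, 0, 0, 0]  minority survives to debate on
```

The contrast is the point: the fully-connected round immediately collapses to the global majority, while the ring keeps dissenting answers in play for subsequent rounds.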

Multiple LLM Setup

The study explores asymmetric node roles in MAD by introducing multiple LLMs with differing strengths. Assigning stronger LLMs to higher-centrality nodes optimizes the problem-solving prowess of the network. Experiments on the harmlessness alignment task confirmed that higher centrality assignments of better-performing LLMs significantly enhance overall performance by expediting dissemination of superior solution strategies across weaker agents.
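As a hypothetical sketch of this assignment strategy (degree centrality is chosen here for simplicity, and the hub topology and model names are illustrative, not the paper's exact setup), stronger models can simply be ranked onto the most-connected nodes:

```python
# Hypothetical sketch: place stronger models on higher-centrality nodes so
# their answers propagate to more agents per round. Degree centrality is an
# assumption for illustration; the paper's measure may differ.

def degree_centrality(topology):
    return {i: len(nbrs) for i, nbrs in topology.items()}

def assign_models(topology, models_by_strength):
    """models_by_strength: strongest model first."""
    centrality = degree_centrality(topology)
    ranked = sorted(topology, key=lambda i: centrality[i], reverse=True)
    return dict(zip(ranked, models_by_strength))

# A 5-node hub: node 0 talks to everyone, the rest only to node 0.
hub = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
assignment = assign_models(
    hub, ["gpt-4", "gpt-3.5", "gpt-3.5", "mistral-7b", "mistral-7b"])
print(assignment[0])  # gpt-4 occupies the hub
```

Placing the strongest model at the hub means every weaker agent reads its response each round, which is the dissemination effect the experiments credit for the performance gain.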

Implications and Future Directions

The implications for AI and LLM research are considerable. Efficient resource utilization while maintaining or enhancing performance is critical for scalable, sustainable AI deployments. The insights gained offer practical pathways for deploying robust multi-agent systems under constrained computational budgets. Future research could further refine dynamic graph topologies and investigate hybrid approaches that combine deterministic and probabilistic connectivity to capture the advantages of both.

In conclusion, the paper presents a compelling case for adopting sparse communication topologies within multi-agent debate frameworks, demonstrating that this approach can effectively balance performance and computational efficiency. This work sets the stage for further exploration into adaptive, scalable communication strategies in multi-agent reinforcement learning and AI alignment applications.
