MARG: Multi-Agent Review Generation for Scientific Papers

(2401.04259)
Published Jan 8, 2024 in cs.CL

Abstract

We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a feedback generation approach using multiple LLM instances that engage in internal discussion. By distributing paper text across agents, MARG can consume the full text of papers beyond the input length limitations of the base LLM, and by specializing agents and incorporating sub-tasks tailored to different comment types (experiments, clarity, impact) it improves the helpfulness and specificity of feedback. In a user study, baseline methods using GPT-4 were rated as producing generic or very generic comments more than half the time, and only 1.7 comments per paper were rated as good overall in the best baseline. Our system substantially improves the ability of GPT-4 to generate specific and helpful feedback, reducing the rate of generic comments from 60% to 29% and generating 3.7 good comments per paper (a 2.2x improvement).

Overview

  • MARG-S uses multiple instances of a large language model to generate peer-review feedback for scientific papers, with the goal of more specific and helpful comments.

  • It utilizes a leader agent and multiple worker and expert agents that specialize in different aspects of critique, working together to handle lengthy texts.

  • A communication protocol is established for agents to exchange information, and a refinement stage is included to polish the feedback.

  • In a user study, the multi-agent system reduced the rate of generic comments from 60% to 29% and produced 3.7 good comments per paper, versus 1.7 for the best baseline, though there is still room for improvement.

  • Future development of MARG-S includes optimizing cost-efficiency, integrating related literature, and further enhancing agent communication for handling larger inputs.

MARG-S, a multi-agent review generation method, addresses a key limitation of LLMs such as GPT-4: the input length limit that prevents a single model instance from reading a full paper. It splits the task of generating peer-review feedback across multiple instances of a language model. The paper text is distributed among several "agents," each of which holds one fragment and communicates with the others, so the system as a whole can process texts longer than the base model's input limit. Specializing agents on particular aspects of critique, such as experiments, clarity, and impact, makes the feedback more specific and helpful.
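
As a rough illustration of distributing paper text across agents, the sketch below greedily packs consecutive paragraphs into chunks that fit a token budget, one chunk per worker. The function name `split_into_chunks`, the whitespace-based token count, and the budget of 80 are assumptions made for this example; the summary does not describe how the paper actually splits text.

```python
def split_into_chunks(paragraphs, max_tokens, count_tokens=lambda s: len(s.split())):
    """Greedily pack consecutive paragraphs into chunks of at most max_tokens.

    Assumption: tokens are approximated by whitespace-separated words.
    A single paragraph longer than max_tokens still becomes its own
    (oversized) chunk; a real system would need to split it further.
    """
    chunks, current, used = [], [], 0
    for para in paragraphs:
        n = count_tokens(para)
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and used + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(para)
        used += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Toy "paper": four paragraphs of 40, 50, 30, and 20 words.
paper = ["intro " * 40, "methods " * 50, "experiments " * 30, "conclusion " * 20]
chunks = split_into_chunks(paper, max_tokens=80)
print(len(chunks))  # 3: [intro], [methods + experiments], [conclusion]
```

Keeping paragraphs whole and in order means each worker sees a contiguous span of the paper, which is one simple way to preserve local context within a chunk.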

System Design

MARG-S's architecture consists of a leader agent that orchestrates the process, multiple worker agents that each receive one section of the paper, and expert agents that each focus on one review aspect. Coordination relies on a communication protocol through which agents exchange messages to gather information from across the whole paper. A refinement stage then polishes the initial feedback, improving clarity and ensuring comments are contextually relevant before they are presented to the user.
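
The leader, worker, and expert roles can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the class names, prompt wording, and the `fake_llm` stub (which stands in for a real model call) are all hypothetical, and the actual message protocol is not specified in this summary.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for a model call: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class Worker:
    """Holds one section of the paper and answers questions about it."""
    name: str
    chunk: str
    llm: LLM

    def answer(self, question: str) -> str:
        return self.llm(f"Excerpt:\n{self.chunk}\n\nQuestion: {question}")


@dataclass
class Expert:
    """Focuses on one review aspect and proposes what to ask about it."""
    aspect: str
    llm: LLM

    def question(self) -> str:
        return self.llm(f"What should a reviewer check about {self.aspect}?")


@dataclass
class Leader:
    """Orchestrates the exchange: ask all workers, draft, then refine."""
    workers: List[Worker]
    llm: LLM
    log: List[str] = field(default_factory=list)

    def broadcast(self, question: str) -> List[str]:
        replies = []
        for worker in self.workers:
            reply = worker.answer(question)
            self.log.append(f"{worker.name}: {reply}")  # record each message
            replies.append(reply)
        return replies

    def review(self, expert: Expert) -> str:
        replies = self.broadcast(expert.question())
        draft = self.llm(f"Draft {expert.aspect} comments from:\n" + "\n".join(replies))
        # Refinement stage: a second pass over the draft before returning it.
        return self.llm(f"Refine these comments:\n{draft}")


def fake_llm(prompt: str) -> str:
    # Deterministic stub so the sketch runs without any model access.
    return f"<reply to {len(prompt)}-char prompt>"


workers = [
    Worker("worker-1", "Section 1 text", fake_llm),
    Worker("worker-2", "Section 2 text", fake_llm),
]
leader = Leader(workers, fake_llm)
comments = leader.review(Expert("experiments", fake_llm))
print(comments)
```

Swapping `fake_llm` for a real model client would leave the control flow unchanged: no single agent ever sees more than its own chunk plus the short messages passed to it.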

User Study Evaluation

In a user study, the multi-agent approach produced higher-quality comments than the baseline methods: the rate of generic comments fell from 60% to 29%, and the number of comments rated good rose from 1.7 per paper for the best baseline to 3.7 (a 2.2x improvement). User feedback suggested that MARG-S offered specific, accurate, and actionable suggestions. There is still room for improvement, however, as a notable proportion of comments were rated "bad" or "highly inaccurate" across all methods.
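
The headline figures reported in the abstract can be checked with simple arithmetic. The short script below uses only those reported numbers (1.7 and 3.7 good comments per paper; 60% and 29% generic comments); the variable names are chosen for this example.

```python
# Figures as reported in the abstract.
baseline_good, marg_good = 1.7, 3.7          # good comments per paper
baseline_generic, marg_generic = 0.60, 0.29  # share of comments rated generic

# Ratio of good comments per paper.
improvement = marg_good / baseline_good

# Reduction in generic comments, in percentage points and relative terms.
absolute_drop = baseline_generic - marg_generic
relative_drop = absolute_drop / baseline_generic

print(round(improvement, 1))    # 2.2
print(round(absolute_drop, 2))  # 0.31
print(round(relative_drop, 2))  # 0.52
```

So the 2.2x figure is the ratio 3.7 / 1.7, and the drop from 60% to 29% is 31 percentage points, or roughly half of the baseline's generic comments.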

Potential and Challenges

MARG-S is a promising step for scientific review generation. It demonstrates an advanced application of LLMs and offers a possible model for future AI-driven peer-review systems. Running such a multi-agent system increases cost, however, which is a significant consideration for practical deployment. Future iterations of MARG-S could benefit from optimization for cost and efficiency, the inclusion of related literature for more informed reviews, and better management of agent communication so that even larger inputs can be handled without overwhelming the system's capacity. With further refinement, systems like MARG-S could significantly aid scientific communities in the review process by offering authors more comprehensive and insightful feedback.
