
Abstract

Open-sourced LLMs have achieved great success in various NLP tasks; however, they remain far inferior to API-based models when acting as agents. How to integrate agent abilities into general LLMs is therefore a crucial and urgent problem. This paper first presents three key observations: (1) the current agent training corpus entangles format following with agent reasoning, shifting significantly from the distribution of the pre-training data; (2) LLMs exhibit different learning speeds on the capabilities required by agent tasks; and (3) current approaches introduce hallucinations as a side effect of improving agent abilities. Based on these findings, we propose Agent-FLAN to effectively Fine-tune LANguage models for Agents. Through careful decomposition and redesign of the training corpus, Agent-FLAN enables Llama2-7B to outperform prior best works by 3.5% across various agent evaluation datasets. With comprehensively constructed negative samples, Agent-FLAN greatly alleviates hallucination issues on our newly established evaluation benchmark. Moreover, it consistently improves the agent capability of LLMs when scaling model sizes, while slightly enhancing their general capability. The code will be available at https://github.com/InternLM/Agent-FLAN.

Agent-FLAN outperforms AgentTuning in understanding API information and providing preferred responses on Toolbench and Agent-H datasets.

Overview

  • Agent-FLAN is a fine-tuning methodology devised to boost the agent capabilities of LLMs by addressing challenges in agent training data and introducing novel fine-tuning techniques.

  • Three key observations were made: the entanglement of agent training data, variable learning speeds across agent capabilities, and unintended side effects of current enhancement approaches, such as hallucination.

  • The methodology achieved a 3.5% improvement in agent tasks using the Llama2-7B model, demonstrating its efficacy in enhancing agent abilities without compromising general LLM capabilities.

  • Agent-FLAN signifies a step towards closing the performance gap between open-sourced LLMs and API-based models, suggesting future research directions for integrating agent functions into LLMs.

Enhancing Agent Abilities in LLMs with Agent-FLAN

Introduction to Agent-FLAN

The quest to imbue LLMs with robust agent capabilities has led to the development of Agent-FLAN, a fine-tuning methodology designed to effectively enhance LLMs' performance in agent tasks. The research stems from the observation that while open-sourced LLMs demonstrate exceptional proficiency in natural language understanding and generation, their ability to act as agents—making decisions based on environmental inputs and executing tasks—lags behind that of their API-based counterparts. Agent-FLAN (Fine-tuning LANguage models for Agents) addresses this gap by refining the agent training corpus and introducing novel fine-tuning techniques tailored for agent tasks.

Key Observations and Methodology

The development of Agent-FLAN was guided by three pivotal observations, each highlighting specific challenges and opportunities in agent tuning:

  1. Entanglement of Agent Training Data: The study found that most agent training data mixes format adherence with agent reasoning, diverging significantly from the pre-training data distribution. This misalignment complicates the learning process for LLMs, constraining their ability to acquire agent-specific skills effectively.
  2. Variable Learning Speeds: LLMs exhibit different learning velocities across various agent-related capabilities. This discrepancy suggests a need for tailored training approaches that account for the unique learning dynamics of each capability.
  3. Side-Effects of Existing Approaches: Current strategies to enhance agent abilities in LLMs often lead to unintended consequences, most notably the introduction of hallucinations (misleading, inaccurate, or irrelevant outputs).
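Observation (1) can be made concrete with a toy transformation. The sketch below is illustrative only (it is not the paper's actual data pipeline, and the sample text and function names are invented): it splits one ReAct-style block, which entangles strict output formatting with the underlying reasoning, into a plain conversational turn plus a separate structured call, so the two can be trained and weighted independently.

```python
# Toy sketch: decouple a ReAct-style sample (format + reasoning entangled)
# into natural-language reasoning and a separate structured tool call.
import re

REACT_SAMPLE = (
    "Thought: I need the current weather, so I should call the weather API.\n"
    "Action: get_weather\n"
    "Action Input: {\"city\": \"Paris\"}"
)

def decouple_react(sample: str) -> dict:
    """Split one ReAct block into reasoning prose and a structured call."""
    thought = re.search(r"Thought:\s*(.*)", sample).group(1)
    action = re.search(r"Action:\s*(.*)", sample).group(1)
    action_input = re.search(r"Action Input:\s*(.*)", sample).group(1)
    return {
        # Natural-language turn: reasoning expressed as ordinary prose,
        # closer to the model's pre-training distribution.
        "assistant": thought,
        # Format-specific part kept separate from the reasoning.
        "tool_call": {"name": action, "arguments": action_input},
    }

sample = decouple_react(REACT_SAMPLE)
print(sample["tool_call"]["name"])  # get_weather
```

The point of the split is that the reasoning turn reads like ordinary conversation, while the rigid format lives only in the structured field.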

To navigate these challenges, Agent-FLAN employs a multi-faceted approach:

  • Alignment with Natural Language Domain: By restructuring agent training data to resemble natural conversations, Agent-FLAN mitigates the issue of data entanglement, facilitating more effective learning of agent abilities.
  • Decomposition and Balanced Training: The methodology breaks down agent tasks into fundamental capabilities and adjusts the training focus according to the distinct learning rates of these capabilities.
  • Mitigation of Hallucinations: Through the creation of an evaluation benchmark for hallucination and the incorporation of negative samples, Agent-FLAN significantly reduces the occurrence of hallucination in LLM outputs.
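The negative-sample idea in the last bullet can be sketched as follows. This is a hypothetical illustration, not code from the Agent-FLAN release; the helper name, data fields, and refusal wording are all invented. The construction pairs a query with a tool list that deliberately contains no suitable tool, and supervises the model to decline rather than hallucinate a non-existent function call.

```python
# Hypothetical sketch of negative-sample construction for hallucination
# mitigation: the target response is an explicit refusal, not a tool call.
def make_negative_sample(query: str, irrelevant_tools: list[str]) -> dict:
    """Build one training example teaching the model to refuse gracefully."""
    tool_list = ", ".join(irrelevant_tools) or "none"
    return {
        "system": f"Available tools: {tool_list}",
        "user": query,
        # Supervised target: decline instead of inventing a tool call.
        "assistant": "None of the available tools can handle this request, "
                     "so I cannot complete it.",
    }

example = make_negative_sample(
    "Book me a flight to Tokyo",
    ["get_weather", "search_wikipedia"],  # deliberately unrelated tools
)
```

Mixing such examples into the training corpus gives the model explicit supervision for the "no applicable tool" case, which standard agent corpora rarely cover.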

Empirical Validation and Results

Agent-FLAN's efficacy is demonstrated through comprehensive experiments using the Llama2-7B model across various agent evaluation benchmarks. The approach achieved a 3.5% improvement over prior best works, showcasing its potential to significantly enhance the agent capabilities of LLMs. Additionally, Agent-FLAN not only boosts agent-specific abilities but also slightly improves the general capabilities of LLMs, underscoring the versatile benefits of the proposed fine-tuning methodology.

Implications and Future Directions

The success of Agent-FLAN in enhancing agent abilities of LLMs has several important implications:

  • Bridging the Gap: The methodology represents a significant step toward narrowing the performance gap between open-sourced LLMs and API-based models in agent tasks.
  • Flexible Learning: The differentiated learning strategies for various agent capabilities highlight the importance of adaptable training methods in maximizing LLMs' potential.
  • Holistic Model Improvement: The positive impact of Agent-FLAN on both agent and general capabilities of LLMs suggests a pathway for developing more universally competent models.

Looking ahead, the insights gained from Agent-FLAN pave the way for further exploration in integrating effective agent functions into LLMs. Future research may delve into more granular training data decomposition, examine the scalability of Agent-FLAN across larger model sizes, and explore its applicability to a broader range of agent tasks.

In conclusion, Agent-FLAN offers a promising avenue for fortifying the agent capabilities of LLMs, marking an important advancement in the pursuit of more intelligent and versatile AI agents.
