
Sub-goal Distillation: A Method to Improve Small Language Agents

arXiv:2405.02749
Published May 4, 2024 in cs.LG

Abstract

While LLMs have demonstrated significant promise as agents in interactive tasks, their substantial computational requirements and restricted number of calls constrain their practical utility, especially in long-horizon interactive tasks such as decision-making or in scenarios involving continuous ongoing tasks. To address these constraints, we propose a method for transferring the performance of an LLM with billions of parameters to a much smaller language model (770M parameters). Our approach involves constructing a hierarchical agent comprising a planning module, which learns through Knowledge Distillation from an LLM to generate sub-goals, and an execution module, which learns to accomplish these sub-goals using elementary actions. In detail, we leverage an LLM to annotate an oracle path with a sequence of sub-goals towards completing a goal. Subsequently, we utilize this annotated data to fine-tune both the planning and execution modules. Importantly, neither module relies on real-time access to an LLM during inference, significantly reducing the overall cost associated with LLM interactions to a fixed cost. In ScienceWorld, a challenging and multi-task interactive text environment, our method surpasses standard imitation learning based solely on elementary actions by 16.7% (absolute). Our analysis highlights the efficiency of our approach compared to other LLM-based methods. Our code and annotated data for distillation can be found on GitHub.

Figure: Knowledge Distillation uses an LLM to generate sub-goals from task descriptions and trajectories.

Overview

  • The paper introduces a method to utilize LLMs for complex interactive tasks efficiently by distilling knowledge into smaller models, structured with a planning and execution module.

  • The hierarchical design allows the system to operate independently after initial training, greatly reducing the need for the LLM at runtime and thereby lowering computational costs.

  • Experiments in the ScienceWorld environment show that this dual-module method outperforms standard imitation learning by 16.7% (absolute), suggesting promise for a range of practical applications.

Simplifying LLMs for Interactive Tasks Using Hierarchical Agents and Knowledge Distillation

Overview of the Dual Module Approach

The research focuses on using LLMs for complex interactive tasks without incurring high computational costs. By distilling knowledge from an LLM into a much smaller model, the study builds a two-part system comprising a planning module and an execution module, designed to work in tandem to solve intricate problems efficiently.

The Mechanism Behind the Model

The system operates through a hierarchical structure:

  1. Planning Module: This component acts as the high-level policy, using a smaller model distilled from an LLM to generate the sub-goals needed to solve a task. These sub-goals spell out, step by step, what must be achieved, providing a clear path forward.
  2. Execution Module: Acting as the low-level policy, this module carries out the sub-goals laid out by the planning module using elementary actions.
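The two modules can be sketched as an alternating loop: the planner proposes a sub-goal, the executor grounds it in elementary actions until it judges the sub-goal achieved, and control returns to the planner. The sketch below is illustrative only; the `planner`, `executor`, and `env` interfaces (including the `"DONE"` sentinel and the `None` "sub-goal achieved" signal) are assumptions, not the paper's actual API.

```python
def run_episode(planner, executor, env, max_subgoals=10, max_actions=20):
    """Minimal hierarchical inference loop (hypothetical interfaces).

    High level: the distilled planner proposes the next sub-goal.
    Low level: the executor emits elementary actions until it judges
    the sub-goal achieved (signalled here by returning None).
    """
    observation = env.reset()
    trajectory = []
    for _ in range(max_subgoals):
        subgoal = planner.next_subgoal(env.task_description, trajectory)
        if subgoal == "DONE":  # planner decides the task is complete
            break
        for _ in range(max_actions):
            action = executor.next_action(subgoal, observation, trajectory)
            if action is None:  # executor judges this sub-goal achieved
                break
            observation, done = env.step(action)
            trajectory.append((subgoal, action, observation))
            if done:  # the environment ended the episode
                return trajectory
    return trajectory
```

Note that neither module calls an LLM here: both are small fine-tuned models, which is what keeps per-step inference cheap.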

How It Works

  • The LLM generates a sequence of sub-goals by analyzing an oracle path toward task completion.
  • These sub-goals and their corresponding elementary actions are used to fine-tune the planning and execution modules.
  • Notably, real-time access to the LLM isn't needed after these modules are fine-tuned, which drastically cuts down the ongoing computational costs.
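The steps above imply a simple data-construction recipe: once the LLM has segmented an oracle path into sub-goals, each segment yields one training example for the executor (sub-goal → action sequence) and one for the planner (sub-goals so far → next sub-goal). The sketch below assumes a hypothetical annotation format and prompt templates for illustration; it is not the paper's exact pipeline.

```python
def build_finetune_examples(task, annotated_path):
    """Split one LLM-annotated oracle path into fine-tuning examples.

    `annotated_path` is assumed to be a list of (subgoal, actions) pairs,
    where `actions` are the elementary actions that achieved that sub-goal.
    """
    planner_examples, executor_examples = [], []
    completed = []
    for subgoal, actions in annotated_path:
        # Planner learns: (task, sub-goals completed so far) -> next sub-goal.
        planner_examples.append(
            {"input": f"task: {task} | done: {'; '.join(completed)}",
             "target": subgoal})
        # Executor learns: (task, sub-goal) -> elementary action sequence.
        executor_examples.append(
            {"input": f"task: {task} | subgoal: {subgoal}",
             "target": " ; ".join(actions)})
        completed.append(subgoal)
    # A final planner example teaches the model when to stop.
    planner_examples.append(
        {"input": f"task: {task} | done: {'; '.join(completed)}",
         "target": "DONE"})
    return planner_examples, executor_examples
```

Both resulting datasets are ordinary text-to-text pairs, so each module can be fine-tuned with a standard sequence-to-sequence objective.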

Significant Results Highlighted

This hierarchical agent design delivers strong results:

  • In ScienceWorld, a challenging multi-task interactive text environment, the method surpassed standard imitation learning by 16.7% (absolute).
  • It is efficient not only in computational resources but also in its flexibility across scenarios, since it does not depend on continuous LLM access.

Practical Implications

Cost Efficiency

By reducing dependency on LLM queries during runtime, the model addresses significant cost constraints, making it viable for longer-term or continuous interactive tasks that would otherwise be computationally prohibitive.
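The cost argument can be made concrete with a toy calculation: an agent that queries an LLM at every step pays a cost that grows with usage, while a distilled agent pays only the one-time annotation cost. The numbers below are purely illustrative, not figures from the paper.

```python
def total_llm_calls(episodes, steps_per_episode, distilled, annotation_calls=500):
    """Toy model of LLM usage (illustrative numbers, not the paper's accounting)."""
    if distilled:
        # One-time annotation cost, amortized over all future episodes.
        return annotation_calls
    # An online LLM agent queries the model at every step, forever.
    return episodes * steps_per_episode
```

Under this model, running 1,000 episodes of 50 steps costs 50,000 LLM calls for an online agent but a fixed 500 for the distilled one, and the gap widens with every additional episode.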

Task-Specific Adaptation

The modular nature of the smaller language models allows for adjustments tailored to specific tasks, enhancing both performance and applicability in diverse applications, from mobile apps to embedded systems in robotics.

Theoretical Implications

The use of Knowledge Distillation (KD) to transfer intricate decision-making capabilities from a large model to a smaller one could open new avenues in AI training methodologies. This approach illustrates how distilled models could potentially retain or even enhance the decision-making prowess of their progenitors without bearing the same operational costs.

Speculating on Future Developments

As AI continues to evolve, models like these could lead to more sustainable and economical AI deployments in environments where continuous learning and interaction are required. Further refinement could lead to models that can dynamically adjust their goals based on changing circumstances, enhancing their decision-making abilities in real-time.

Continued advances in KD techniques may allow even smaller models to perform tasks previously thought only manageable by significantly larger systems, broadening the scope of AI's applicability in resource-constrained settings.

Conclusion

This study provides a compelling look at how hierarchical agent structures and knowledge distillation can reduce reliance on LLMs, and their computational demands, in complex interactive environments. The promising results not only bolster the case for more cost-effective AI models but also open intriguing possibilities for future research in AI efficiency and adaptability.
