
Learning to Retrieve Iteratively for In-Context Learning

(arXiv:2406.14739)
Published Jun 20, 2024 in cs.CL

Abstract

We introduce iterative retrieval, a novel framework that empowers retrievers to make iterative decisions through policy optimization. Finding an optimal portfolio of retrieved items is a combinatorial optimization problem, generally considered NP-hard. This approach provides a learned approximation to such a solution, meeting specific task requirements under a given family of LLMs. We propose a training procedure based on reinforcement learning, incorporating feedback from LLMs. We instantiate an iterative retriever for composing in-context learning (ICL) exemplars and apply it to various semantic parsing tasks that demand synthesized programs as outputs. By adding only 4M additional parameters for state encoding, we convert an off-the-shelf dense retriever into a stateful iterative retriever, outperforming previous methods in selecting ICL exemplars on semantic parsing datasets such as CalFlow, TreeDST, and MTOP. Additionally, the trained iterative retriever generalizes across different inference LLMs beyond the one used during training.

Figure: ICL prompt construction using BM25 versus a trained iterative retriever on SMCalFlow.

Overview

  • The paper introduces an iterative retrieval framework designed to enhance in-context learning by optimizing exemplar selection through policy optimization, addressing the NP-hard nature of the problem.

  • The framework employs reinforcement learning with feedback from LLMs, utilizing techniques like Proximal Policy Optimization to train the retriever.

  • Empirical results show significant improvements in performance on datasets such as SMCalFlow, TreeDST, and MTOP, demonstrating the framework's efficacy and potential for broader applicability.

Iterative Retrieval Framework for In-Context Learning

The paper "Learning to Retrieve Iteratively for In-Context Learning" presents a novel framework termed iterative retrieval, designed to enhance the retrieval of exemplars for in-context learning (ICL). In this framework, retrievers make iterative decisions via policy optimization, addressing the challenge of selecting an optimal set of retrieved items, a problem generally considered NP-hard.

Key Contributions

This work introduces several innovative elements to the ICL paradigm:

  1. Iterative Retrieval Framework: The proposed framework allows a sequence of retrieval calls, issuing a different query vector at each step. This iterative process builds a trajectory of exemplar selections, optimizing the set as a whole to improve ICL performance (see the sketch after this list).
  2. Reinforcement Learning for Training: The retriever is trained with reinforcement learning (RL), with the LLM serving as the environment and its feedback as the reward signal. Training maximizes a reward that reflects how effective the composed prompt is at eliciting correct outputs.
  3. Enhanced Semantic Parsing: The iterative retriever is instantiated for the task of semantic parsing, which requires a high degree of compositionality and synthesis in outputs. The framework demonstrates superior results in datasets like SMCalFlow, TreeDST, and MTOP.
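To make the mechanics concrete, here is a minimal sketch of such a retrieval loop in PyTorch. The class and method names are hypothetical, and the architecture is deliberately simplified (a frozen dense encoder plus a single GRU cell); the paper's exact design may differ.

```python
import torch
import torch.nn as nn

class IterativeRetriever(nn.Module):
    """Sketch of a stateful retriever: an off-the-shelf dense encoder
    plus a small GRU cell carrying state across retrieval steps.
    At dim=768 a GRUCell adds ~3.5M parameters, in the ballpark of
    the ~4M extra state-encoding parameters the paper reports."""

    def __init__(self, encoder, dim=768):
        super().__init__()
        self.encoder = encoder           # frozen dense retriever (assumed to map text -> (dim,) tensor)
        self.gru = nn.GRUCell(dim, dim)  # the only newly trained module

    def retrieve(self, question, candidates, cand_vecs, k=8):
        """Pick k exemplars one at a time; each pick updates the query
        state, so later choices depend on what was already selected."""
        state = self.encoder(question)          # (dim,) initial query vector
        picked = []
        for _ in range(k):
            scores = cand_vecs @ state          # (N,) dot-product similarities
            if picked:
                scores[picked] = float("-inf")  # mask already-chosen items
            idx = int(scores.argmax())
            picked.append(idx)
            # fold the chosen exemplar's embedding back into the state
            state = self.gru(cand_vecs[idx].unsqueeze(0),
                             state.unsqueeze(0)).squeeze(0)
        return [candidates[i] for i in picked]
```

During RL training, the greedy argmax would be replaced by sampling from a softmax over the scores, giving PPO a stochastic policy to optimize.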

Theoretical and Practical Implications

Theoretical Contributions

  • Markov Decision Processes: By formulating the retrieval task as a Markov Decision Process (MDP), the study leverages well-established RL techniques to optimize the sequence of exemplar selections.
  • Policy Optimization: The method trains the iterative retriever with Proximal Policy Optimization (PPO). The policy selects exemplars conditioned on a retrieval state that a GRU (Gated Recurrent Unit) updates after each step; a training sketch follows this list.
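As a rough illustration of the training signal, the sketch below applies a clipped-PPO update to a batch of retrieval trajectories. The reward is assumed to be downstream LLM feedback, e.g. whether the prompt built from the selected exemplars led the LLM to emit the correct program; `policy.log_prob` is a hypothetical interface, and the mean-reward baseline stands in for a learned value function.

```python
import torch

def ppo_update(policy, optimizer, states, actions, old_logprobs, rewards,
               clip_eps=0.2):
    """One clipped-PPO step over retrieval trajectories. `rewards` holds
    per-trajectory LLM feedback (e.g. 1.0 for an exact-match parse,
    0.0 otherwise), broadcast across the steps of each trajectory."""
    logprobs = policy.log_prob(states, actions)   # hypothetical policy API
    ratio = torch.exp(logprobs - old_logprobs)    # importance-sampling ratio
    adv = rewards - rewards.mean()                # crude baseline, not a learned critic
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    loss = -torch.min(unclipped, clipped).mean()  # standard PPO clipped objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```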

Practical Contributions

  • Stateful Retrievers: The retriever maintains an internal state, allowing subsequent retrieval actions to be influenced by previously selected exemplars. This stateful approach significantly outperforms traditional off-the-shelf retrievers.
  • Reduced Computational Complexity: By decomposing selection into iterative steps, the method provides a learned approximation to the optimal exemplar set, sidestepping the cost of exhaustive combinatorial search (see the worked numbers after this list).
  • Generalization Across LLMs: The research demonstrates that iteratively trained retrievers can generalize across different LLMs, including those not used during training. This capability is crucial for practical applications where retrievers need to adapt to various LLM architectures and sizes.
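To see why the learned approximation pays off, consider hypothetical but representative numbers: choosing k = 8 exemplars from a pool of N = 10,000 candidates means searching over C(10000, 8) ≈ 2 × 10^27 subsets, whereas an iterative retriever makes just 8 sequential selections, each an O(N) similarity scan, for roughly 8 × 10,000 = 80,000 score computations in total.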

Strong Numerical Results

The empirical results substantiate the efficacy of the proposed iterative retrieval framework:

  • On the SMCalFlow dataset, the IterR framework achieves an Exact Match (EM@1) score of 54.1%, significantly outperforming traditional methods such as BM25 (39.8%) and Contriever (44.0%).
  • On the TreeDST dataset, IterR yields an EM@1 of 58.2%, surpassing previous methods including EPR (54.0%) and CEIL (56.2%).
  • Similar improvements are observed in the MTOP dataset, with IterR achieving an EM@1 score of 63.9%.

These substantial gains indicate that iterative retrievers are adept at constructing prompts that markedly improve ICL and downstream LLM generation.

Future Developments

Moving forward, the iterative retrieval framework opens several avenues for exploration:

  • Dynamic State Transitions: More sophisticated state transition models, such as those incorporating Transformer decoders, could further enhance the model's capability to navigate the retrieval space.
  • Structured Reward Functions: Tailoring reward functions to specific downstream tasks (e.g., structured program synthesis) can potentially yield even better performance.
  • Broader Task Applicability: Extending the framework to other LLM tasks beyond semantic parsing could validate its utility and adaptability in a wide array of applications.

Conclusion

The paper's iterative retrieval framework offers a compelling approach to enhancing ICL through stateful and policy-optimized retrievers. By effectively addressing the NP-hard problem of optimal exemplar selection, this research provides an important tool for advancing the performance and adaptability of LLMs in various semantic parsing and program synthesis tasks. Future research directions include refining state transition models and reward functions to further elevate the effectiveness of iterative retrievers.
