
Learning to Retrieve Iteratively for In-Context Learning

(arXiv:2406.14739)
Published Jun 20, 2024 in cs.CL

Abstract

We introduce iterative retrieval, a novel framework that empowers retrievers to make iterative decisions through policy optimization. Finding an optimal portfolio of retrieved items is a combinatorial optimization problem, generally considered NP-hard. This approach provides a learned approximation to such a solution, meeting specific task requirements under a given family of LLMs. We propose a training procedure based on reinforcement learning, incorporating feedback from LLMs. We instantiate an iterative retriever for composing in-context learning (ICL) exemplars and apply it to various semantic parsing tasks that demand synthesized programs as outputs. By adding only 4M additional parameters for state encoding, we convert an off-the-shelf dense retriever into a stateful iterative retriever, outperforming previous methods in selecting ICL exemplars on semantic parsing datasets such as CalFlow, TreeDST, and MTOP. Additionally, the trained iterative retriever generalizes across different inference LLMs beyond the one used during training.

Figure: ICL prompt construction using BM25 versus a trained iterative retriever on SMCalFlow.

Overview

  • The paper introduces an iterative retrieval framework designed to enhance in-context learning by optimizing exemplar selection through policy optimization, addressing the NP-hard nature of the problem.

  • The framework employs reinforcement learning with feedback from LLMs, utilizing techniques like Proximal Policy Optimization to train the retriever.

  • Empirical results show significant improvements in performance on datasets such as SMCalFlow, TreeDST, and MTOP, demonstrating the framework's efficacy and potential for broader applicability.

Iterative Retrieval Framework for In-Context Learning

The paper "Learning to Retrieve Iteratively for In-Context Learning" presents a novel framework termed iterative retrieval, designed to enhance the retrieval of exemplars for in-context learning (ICL). In this framework, retrievers make iterative decisions via policy optimization, addressing the challenge of selecting an optimal set of retrieved items, a problem generally considered NP-hard.

Key Contributions

This work introduces several innovative elements to the ICL paradigm:

  1. Iterative Retrieval Framework: The proposed framework allows a sequence of retrieval calls, issuing a different query vector at each step. This iterative process builds a trajectory of exemplar selections, optimizing the set as a whole to improve ICL performance (see the sketch after this list).
  2. Reinforcement Learning for Training: The retriever is trained with reinforcement learning (RL), with the LLM serving as the environment and its feedback as the reward signal. Training maximizes a reward that reflects how effective the composed prompt is at eliciting correct outputs.
  3. Enhanced Semantic Parsing: The iterative retriever is instantiated for the task of semantic parsing, which requires a high degree of compositionality and synthesis in outputs. The framework demonstrates superior results in datasets like SMCalFlow, TreeDST, and MTOP.
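To make the mechanics concrete, here is a minimal sketch of such a retrieval loop in PyTorch. The class and method names are hypothetical, and the architecture is deliberately simplified (a frozen dense encoder plus a single GRU cell); the paper's exact design may differ.

```python
import torch
import torch.nn as nn

class IterativeRetriever(nn.Module):
    """Sketch of a stateful retriever: an off-the-shelf dense encoder
    plus a small GRU cell carrying state across retrieval steps.
    At dim=768 a GRUCell adds ~3.5M parameters, in the ballpark of
    the ~4M extra state-encoding parameters the paper reports."""

    def __init__(self, encoder, dim=768):
        super().__init__()
        self.encoder = encoder           # frozen dense retriever (assumed to map text -> (dim,) tensor)
        self.gru = nn.GRUCell(dim, dim)  # the only newly trained module

    def retrieve(self, question, candidates, cand_vecs, k=8):
        """Pick k exemplars one at a time; each pick updates the query
        state, so later choices depend on what was already selected."""
        state = self.encoder(question)          # (dim,) initial query vector
        picked = []
        for _ in range(k):
            scores = cand_vecs @ state          # (N,) dot-product similarities
            if picked:
                scores[picked] = float("-inf")  # mask already-chosen items
            idx = int(scores.argmax())
            picked.append(idx)
            # fold the chosen exemplar's embedding back into the state
            state = self.gru(cand_vecs[idx].unsqueeze(0),
                             state.unsqueeze(0)).squeeze(0)
        return [candidates[i] for i in picked]
```

During RL training, the greedy argmax would be replaced by sampling from a softmax over the scores, giving PPO a stochastic policy to optimize.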

Theoretical and Practical Implications

Theoretical Contributions

  • Markov Decision Processes: By formulating the retrieval task as a Markov Decision Process (MDP), the study leverages well-established RL techniques to optimize the sequence of exemplar selections.
  • Policy Optimization: The method trains the iterative retriever with Proximal Policy Optimization (PPO). The policy selects exemplars conditioned on a retrieval state that a GRU (Gated Recurrent Unit) updates after each step; a training sketch follows this list.
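As a rough illustration of the training signal, the sketch below applies a clipped-PPO update to a batch of retrieval trajectories. The reward is assumed to be downstream LLM feedback, e.g. whether the prompt built from the selected exemplars led the LLM to emit the correct program; `policy.log_prob` is a hypothetical interface, and the mean-reward baseline stands in for a learned value function.

```python
import torch

def ppo_update(policy, optimizer, states, actions, old_logprobs, rewards,
               clip_eps=0.2):
    """One clipped-PPO step over retrieval trajectories. `rewards` holds
    per-trajectory LLM feedback (e.g. 1.0 for an exact-match parse,
    0.0 otherwise), broadcast across the steps of each trajectory."""
    logprobs = policy.log_prob(states, actions)   # hypothetical policy API
    ratio = torch.exp(logprobs - old_logprobs)    # importance-sampling ratio
    adv = rewards - rewards.mean()                # crude baseline, not a learned critic
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    loss = -torch.min(unclipped, clipped).mean()  # standard PPO clipped objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```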

Practical Contributions

  • Stateful Retrievers: The retriever maintains an internal state, allowing subsequent retrieval actions to be influenced by previously selected exemplars. This stateful approach significantly outperforms traditional off-the-shelf retrievers.
  • Reduced Computational Complexity: By decomposing selection into iterative steps, the method provides a learned approximation to the optimal exemplar set, sidestepping the cost of exhaustive combinatorial search (see the worked numbers after this list).
  • Generalization Across LLMs: The research demonstrates that iteratively trained retrievers can generalize across different LLMs, including those not used during training. This capability is crucial for practical applications where retrievers need to adapt to various LLM architectures and sizes.
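To see why the learned approximation pays off, consider hypothetical but representative numbers: choosing k = 8 exemplars from a pool of N = 10,000 candidates means searching over C(10000, 8) ≈ 2 × 10^27 subsets, whereas an iterative retriever makes just 8 sequential selections, each an O(N) similarity scan, for roughly 8 × 10,000 = 80,000 score computations in total.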

Strong Numerical Results

The empirical results substantiate the efficacy of the proposed iterative retrieval framework:

  • On the SMCalFlow dataset, the IterR framework achieves an Exact Match (EM@1) score of 54.1%, significantly outperforming traditional methods such as BM25 (39.8%) and Contriever (44.0%).
  • On the TreeDST dataset, IterR yields an EM@1 of 58.2%, surpassing previous methods including EPR (54.0%) and CEIL (56.2%).
  • Similar improvements are observed in the MTOP dataset, with IterR achieving an EM@1 score of 63.9%.

These substantial gains indicate that iterative retrievers are adept at constructing prompts that markedly improve ICL and downstream LLM generation.

Future Developments

Moving forward, the iterative retrieval framework opens several avenues for exploration:

  • Dynamic State Transitions: More sophisticated state transition models, such as those incorporating Transformer decoders, could further enhance the model's capability to navigate the retrieval space.
  • Structured Reward Functions: Tailoring reward functions to specific downstream tasks (e.g., structured program synthesis) can potentially yield even better performance.
  • Broader Task Applicability: Extending the framework to other LLM tasks beyond semantic parsing could validate its utility and adaptability in a wide array of applications.

Conclusion

The paper's iterative retrieval framework offers a compelling approach to enhancing ICL through stateful and policy-optimized retrievers. By effectively addressing the NP-hard problem of optimal exemplar selection, this research provides an important tool for advancing the performance and adaptability of LLMs in various semantic parsing and program synthesis tasks. Future research directions include refining state transition models and reward functions to further elevate the effectiveness of iterative retrievers.
