In-context Learning with Retrieved Demonstrations for Language Models: A Survey

(arXiv:2401.11624)
Published Jan 21, 2024 in cs.CL, cs.AI, and cs.IR

Abstract

Language models, especially pre-trained LLMs, have showcased remarkable abilities as few-shot in-context learners, adept at adapting to new tasks with just a few demonstrations in the input context. However, a model's ability to perform in-context learning (ICL) is sensitive to the choice of the few-shot demonstrations. Instead of using a fixed set of demonstrations, one recent development is to retrieve demonstrations tailored to each input query. The implementation of demonstration retrieval is relatively straightforward, leveraging existing databases and retrieval systems. This not only improves the efficiency and scalability of the learning process but also has been shown to reduce biases inherent in manual example selection. In light of the encouraging results and growing research in ICL with retrieved demonstrations, we conduct an extensive review of studies in this area. In this survey, we discuss and compare different design choices for retrieval models, retrieval training procedures, and inference algorithms.

Overview

  • The survey paper reviews retrieval-based in-context learning (RetICL), which dynamically selects demonstrations tailored to each query to optimize language model performance.

  • RetICL enhances LLMs in few-shot learning scenarios by focusing on the relevance and usefulness of demonstrations without updating model parameters.

  • Various RetICL strategies are discussed, including single-pass ("one-hoc"), clustering-based, and iterative retrieval, each employing a different method of demonstration selection to improve alignment with the query.

  • RetICL faces challenges around corpus creation, retriever selection, and integration of advanced training methods but has shown efficacy in tasks ranging from QA to text generation.

  • The paper suggests that future research should address existing issues and deepen theoretical understanding, pointing to RetICL's potential to advance AI, particularly in resource-constrained settings.

Introduction

Few-shot in-context learning (ICL) is the capability of LLMs to adapt to new tasks using a limited number of demonstrations. This capability obviates the need for task-specific fine-tuning, offering advantages such as resource efficiency and a reduced risk of overfitting. Traditional ICL approaches use a fixed set of demonstrations for all queries, which leaves much of the LLMs' potential untapped. This survey provides an extensive review of a rapidly growing variant: retrieval-based ICL (RetICL), in which tailored demonstrations are selected dynamically for each query to optimize model performance.
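
To make the setup concrete, below is a minimal, illustrative sketch of vanilla few-shot ICL on a sentiment task. The task format, demonstrations, and function names are our own, not from the survey; note that the demonstrations here are fixed regardless of the query:

```python
# Minimal sketch of vanilla few-shot ICL (names are illustrative): the task
# is specified entirely through demonstrations placed in the prompt, and the
# model's parameters are never updated.

FIXED_DEMOS = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I wasted two hours of my life.", "negative"),
]

def build_prompt(query: str, demos: list[tuple[str, str]]) -> str:
    """Concatenate (input, label) demonstrations, then append the test query."""
    blocks = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

# Traditional ICL reuses the same demonstrations for every query.
print(build_prompt("An instant classic.", FIXED_DEMOS))
```

RetICL replaces the fixed demonstration list with demonstrations retrieved per query, as sketched in the next section.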

Few-shot In-context Learning for Language Models

LLMs excel in few-shot learning scenarios, making inferences from a handful of demonstrations without any parameter updates. Despite significant strides, the success of ICL hinges on the quality, quantity, and diversity of the demonstrations, which calls for techniques that shift from static to dynamic, query-oriented demonstration selection. RetICL aims to maximize the relevance and usefulness of these demonstrations by considering key factors such as the complexity of the retrieval model, the diversity of the retrieval corpus, and the retriever's objectives during the selection process.
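
As a hedged illustration of this shift, the sketch below retrieves the top-k most similar demonstrations for each query. TF-IDF similarity stands in for the learned dense encoders (e.g., Sentence-BERT or DPR) that RetICL methods typically use, and all names and examples are our own:

```python
# Minimal sketch of per-query demonstration retrieval. TF-IDF is a stand-in
# for a learned dense encoder; the pool and names are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Candidate pool of labeled demonstrations (in practice, a large training set).
pool = [
    ("The plot never goes anywhere.", "negative"),
    ("A warm, funny, beautifully acted film.", "positive"),
    ("The soundtrack is forgettable but the visuals stun.", "positive"),
    ("Two hours I will never get back.", "negative"),
]

vectorizer = TfidfVectorizer()
pool_matrix = vectorizer.fit_transform([x for x, _ in pool])  # embed pool once

def retrieve_demos(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k demonstrations most similar to the query."""
    sims = cosine_similarity(vectorizer.transform([query]), pool_matrix)[0]
    top = sims.argsort()[::-1][:k]  # indices of the k highest similarities
    return [pool[i] for i in top]

# Each query now gets its own tailored demonstrations.
print(retrieve_demos("Gorgeous cinematography, but the story stalls."))
```

In practice, similarities come from a dense retriever rather than TF-IDF, and the retrieved pairs are formatted into the prompt exactly as in the earlier sketch.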

In-context Learning with Demonstration Retrieval

RetICL selects demonstrations in alignment with the input query. Various strategies realize this objective, spanning single-pass ("one-hoc") retrieval, clustering retrieval, and iterative retrieval. They differ in how demonstrations are selected: independently, through clustering for diversity, or iteratively so that each choice builds on the context of previously chosen demonstrations. The retrieval corpus is equally vital and can range from in-domain, mixed-domain, and cross-domain collections to raw text and unlabeled queries. Advanced RetICL techniques fine-tune the retrieval model itself, often using the LLM's own feedback to curate training data, with objectives ranging from the InfoNCE contrastive loss to distillation via KL divergence, aiming for both relevance and diversity.
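
To illustrate the contrastive objective mentioned above, here is a minimal sketch of an InfoNCE loss with in-batch negatives, as commonly used to train dense retrievers. The shapes, temperature, and positive-selection scheme are simplifying assumptions rather than any one paper's recipe:

```python
# Sketch of InfoNCE with in-batch negatives: each query embedding is pulled
# toward the embedding of a demonstration judged helpful (the positive) and
# pushed away from the other demonstrations in the batch (the negatives).
import torch
import torch.nn.functional as F

def info_nce(query_emb: torch.Tensor, demo_emb: torch.Tensor,
             temperature: float = 0.05) -> torch.Tensor:
    """query_emb, demo_emb: (batch, dim). Row i of demo_emb is the positive
    for query i; every other row serves as an in-batch negative."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(demo_emb, dim=-1)
    logits = q @ d.T / temperature                      # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=q.device)   # diagonal entries are positives
    return F.cross_entropy(logits, labels)

# Example: a batch of 8 query/positive-demonstration embedding pairs.
q = torch.randn(8, 128, requires_grad=True)  # retriever's query embeddings
d = torch.randn(8, 128, requires_grad=True)  # embeddings of helpful demos
loss = info_nce(q, d)
loss.backward()  # gradients flow back into the retriever's encoder
```

Distillation-based variants instead minimize the KL divergence between the retriever's similarity distribution over candidates and a target distribution derived from LLM feedback (e.g., how much each candidate demonstration improves the LLM's likelihood of the correct answer).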

Applications and Future Directions

RetICL has demonstrated efficacy across several task categories, including natural language understanding, reasoning, knowledge-based QA, and text generation. Challenges persist in corpus creation, retriever choice, training methods, and the integration of active learning. Future research must resolve these challenges while also deepening our theoretical understanding of why retrieved, similar demonstrations outperform random selection, and of how RetICL can be adapted to smaller models without sacrificing performance.

The insights gleaned from RetICL highlight the remarkable yet untapped potential of LLMs when demonstrations are chosen with an informed, context-sensitive retrieval strategy. Continued exploration of this domain is poised to refine our understanding and use of LLMs, pushing the boundaries of AI within complex, resource-constrained environments. This survey captures the current state of the field while guiding future developments in the evolving landscape of in-context learning.

