
A Survey on In-context Learning (2301.00234v6)

Published 31 Dec 2022 in cs.CL and cs.AI

Abstract: With the increasing capabilities of LLMs, in-context learning (ICL) has emerged as a new paradigm for NLP, where LLMs make predictions based on contexts augmented with a few examples. It has been a significant trend to explore ICL to evaluate and extrapolate the ability of LLMs. In this paper, we aim to survey and summarize the progress and challenges of ICL. We first present a formal definition of ICL and clarify its correlation to related studies. Then, we organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis. Additionally, we explore various ICL application scenarios, such as data engineering and knowledge updating. Finally, we address the challenges of ICL and suggest potential directions for further research. We hope that our work can encourage more research on uncovering how ICL works and improving ICL.

Citations (330)

Summary

  • The paper demonstrates that in-context learning enables pretrained models to make predictions using few-shot examples without updating parameters.
  • It outlines training strategies, both supervised and self-supervised, that strengthen a model's ICL ability, alongside demonstration design methods that shape performance.
  • The study highlights challenges like sensitivity to demonstration design and scalability, offering insights into robust scoring functions and normalization techniques.

A Survey on In-Context Learning

Introduction

In recent years, LLMs have demonstrated remarkable abilities through in-context learning (ICL), a paradigm that allows models to make predictions by leveraging a context supplemented with a few examples. This survey paper provides a comprehensive examination of the development and challenges associated with ICL, aiming to clarify its definition, explore advanced techniques, and discuss potential research directions.

ICL facilitates learning by analogy: the model is given a demonstration context composed of a few examples written in natural language. Unlike traditional supervised learning, which updates parameters during a training stage, ICL operates directly on pretrained models without altering their parameters. The paradigm promises data efficiency and easy knowledge integration, since behavior can be adjusted simply by editing demonstration templates and human-readable prompts (Figure 1).

Figure 1: Illustration of in-context learning. ICL requires a demonstration context containing a few examples written in natural language templates. Taking the demonstration and a query as input, the LLM makes a prediction.

Formal Definition and Training Strategies

ICL can be formally defined as estimating the likelihood of a potential answer conditioned on a demonstration set using a pretrained LLM. Given a query text, the model selects the candidate answer with the highest estimated probability, determined by a scoring function. The demonstration set includes optional task instructions and demonstration examples, formatted in natural language.
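
Concretely, the survey formalizes this as follows (notation paraphrased): given a query x, a candidate answer set Y = {y_1, ..., y_m}, and a demonstration set C consisting of an optional instruction I and k formatted examples, a pretrained model M scores each candidate via a scoring function f:

```latex
% Demonstration set: optional instruction plus k formatted examples
% C = \{ I, s(x_1, y_1), \dots, s(x_k, y_k) \}
P(y_j \mid x) \triangleq f_{\mathcal{M}}(y_j, C, x),
\qquad
\hat{y} = \arg\max_{y_j \in Y} P(y_j \mid x)
```

The prediction is simply the candidate answer with the highest score under the chosen scoring function.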

Researchers have explored methods beyond the standard language modeling objective to strengthen ICL ability. Supervised in-context training finetunes models on tasks reformatted as demonstration-style inputs: MetaICL meta-trains on a broad collection of tasks presented in the ICL format, while symbol tuning replaces natural-language labels with arbitrary symbols so the model must infer the input-label mapping from the demonstrations themselves (a sketch of this idea follows). Meanwhile, self-supervised in-context training constructs training data from raw corpora, bridging the gap between pretraining and ICL inference and promoting task generalization (2301.00234).
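
As an illustration, the snippet below sketches symbol-tuning-style data construction in Python. The template strings, symbol pool, and helper names are invented for this sketch; it is not the paper's exact recipe.

```python
import random

# Sketch of symbol-tuning-style data construction: natural-language
# labels are replaced with semantically empty symbols, so the model
# must learn the input-label mapping from the demonstrations rather
# than from label semantics. All names here are illustrative.

SYMBOL_POOL = ["Foo", "Bar", "Qux", "Zap"]

def symbolize(examples):
    """Remap each distinct label to a random, semantically empty symbol."""
    labels = sorted({y for _, y in examples})
    symbols = random.sample(SYMBOL_POOL, k=len(labels))
    mapping = dict(zip(labels, symbols))
    return [(x, mapping[y]) for x, y in examples], mapping

def build_prompt(demos, query):
    """Concatenate formatted demonstrations with the query."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

demos = [("great movie, loved it", "positive"),
         ("dull and far too long", "negative")]
symbol_demos, mapping = symbolize(demos)
print(build_prompt(symbol_demos, "a touching, well-acted film"))
```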

Demonstration Designing

Demonstration design is central to ICL performance. It spans multiple strategies:

  1. Organization: Selecting and ordering demonstration examples significantly impacts model predictions. Techniques range from kNN-based unsupervised retrievers to supervised approaches that score and retrieve optimal examples (see the retrieval sketch after this list).
  2. Formatting: Proper formatting enhances understanding in complex reasoning tasks. Instruction formatting and chain-of-thought processes are instrumental in guiding models through intermediate reasoning steps, thereby improving comprehension and task execution (2301.00234).
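
To make the unsupervised retrieval idea concrete, here is a minimal Python sketch of kNN-based demonstration selection. The hashing embedder stands in for a real sentence encoder (e.g., SBERT) and is illustrative only.

```python
import numpy as np

# Minimal sketch of kNN-based demonstration selection: embed the query
# and all candidate training examples, then pick the k nearest examples
# as demonstrations.

def embed(text, dim=256):
    """Toy bag-of-words hashing embedding (placeholder for a real encoder)."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def select_demonstrations(query, pool, k=2):
    """Return the k examples whose inputs are most similar to the query."""
    q = embed(query)
    sims = [float(q @ embed(x)) for x, _ in pool]
    top = np.argsort(sims)[::-1][:k]
    return [pool[i] for i in top]

pool = [("the food was amazing", "positive"),
        ("terrible service, never again", "negative"),
        ("the plot was gripping", "positive")]
print(select_demonstrations("service was slow and rude", pool, k=2))
```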

Challenges and Future Directions

While promising, ICL presents challenges in robustness, efficiency, and scalability. The performance of ICL is highly sensitive to demonstration design, prompting research into more consistent scoring functions and normalization strategies. Moreover, as computational demands increase, optimizing prompting strategies and investigating novel pretraining techniques that enhance ICL-specific capabilities are crucial future directions.
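
As one example of such a normalization strategy, the sketch below contrasts raw sequence log-likelihood with length-normalized (mean per-token) scoring, a common way to reduce bias toward short candidate answers. The hard-coded log-probabilities stand in for values an LLM would produce.

```python
# Sketch of length-normalized answer scoring. Raw sequence
# log-likelihood favors short answers; dividing by token count
# removes that length bias.

def sequence_score(token_logprobs):
    """Raw log-likelihood of an answer: sum of token log-probs."""
    return sum(token_logprobs)

def normalized_score(token_logprobs):
    """Length-normalized score: mean per-token log-prob."""
    return sum(token_logprobs) / len(token_logprobs)

candidates = {
    "good": [-0.7],                          # one moderately likely token
    "very good indeed": [-0.2, -0.3, -0.4],  # three confident tokens
}

raw = max(candidates, key=lambda a: sequence_score(candidates[a]))
norm = max(candidates, key=lambda a: normalized_score(candidates[a]))
print(f"raw winner: {raw}, normalized winner: {norm}")
# raw scoring picks "good" (-0.7 vs -0.9); normalization picks the
# longer answer whose tokens are individually more confident.
```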

Conclusion

In-context learning has emerged as a powerful tool in the arsenal of AI research, offering a unique approach to deploying LLMs in various domains. This survey underscores the ongoing efforts to refine ICL methodologies and provides insights into future research paths that could bridge gaps in understanding and application, ushering in advancements in both theoretical constructs and practical implementations.
