Implicit In-context Learning (2405.14660v2)

Published 23 May 2024 in cs.LG, cs.AI, and cs.CL

Abstract: In-context Learning (ICL) empowers LLMs to swiftly adapt to unseen tasks at inference time by prefixing a few demonstration examples before queries. Despite its versatility, ICL incurs substantial computational and memory overheads compared to zero-shot learning and is sensitive to the selection and order of demonstration examples. In this work, we introduce Implicit In-context Learning (I2CL), an innovative paradigm that reduces the inference cost of ICL to that of zero-shot learning with minimal information loss. I2CL operates by first generating a condensed vector representation, namely a context vector, extracted from the demonstration examples. It then conducts an inference-time intervention by injecting a linear combination of the context vector and query activations back into the model's residual streams. Empirical evaluation on nine real-world tasks across three model architectures demonstrates that I2CL achieves few-shot level performance at zero-shot inference cost, and it exhibits robustness against variations in demonstration examples. Furthermore, I2CL facilitates a novel representation of task-ids, enhancing task similarity detection and fostering effective transfer learning. We also perform a comprehensive analysis and ablation study on I2CL, offering deeper insights into its internal mechanisms. Code is available at https://github.com/LzVv123456/I2CL.
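
The two steps described in the abstract (condense the demonstrations into a context vector, then inject a linear combination of that vector and the query activations) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the function names, the mean aggregation, the scalar coefficients `lam` and `beta`, and the use of random arrays in place of real residual-stream activations are all assumptions made for illustration. The linked repository contains the actual method.

```python
import numpy as np


def context_vector(demo_activations: np.ndarray) -> np.ndarray:
    """Condense demonstration activations of shape (n_demos, d_model) into one vector.

    Assumption: a plain mean over demonstrations. The abstract only says the
    context vector is "extracted from the demonstration examples".
    """
    return demo_activations.mean(axis=0)


def inject(query_activation: np.ndarray, ctx_vec: np.ndarray,
           lam: float = 0.1, beta: float = 1.0) -> np.ndarray:
    """Return a linear combination of the query activation and the context vector.

    Assumption: scalar coefficients `lam` and `beta` are hypothetical names;
    how the coefficients are chosen or learned is not stated in the abstract.
    """
    return beta * query_activation + lam * ctx_vec


# Random stand-ins for the activations of one layer (hypothetical sizes).
rng = np.random.default_rng(0)
demos = rng.normal(size=(5, 8))   # 5 demonstrations, hidden size 8
query = rng.normal(size=8)        # zero-shot query activation

ctx = context_vector(demos)       # computed once, reused for every query
steered = inject(query, ctx, lam=0.1, beta=1.0)
```

In this toy setting the demonstrations are processed once to produce `ctx`, and each query then needs only a single vector addition on top of a zero-shot forward pass. Because a mean does not depend on row order, this particular aggregation is also unaffected by the order of the demonstrations.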

GitHub

GitHub - LzVv123456/I2CL (27 stars)

Tweets

https://twitter.com/lzvv123456/status/1914392936646590879

https://twitter.com/HaoGarfield/status/1915470241968382303

https://twitter.com/lzvv123456/status/1914327227694538966
