Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 47 tok/s
Gemini 2.5 Pro 37 tok/s Pro
GPT-5 Medium 15 tok/s Pro
GPT-5 High 11 tok/s Pro
GPT-4o 101 tok/s Pro
Kimi K2 195 tok/s Pro
GPT OSS 120B 465 tok/s Pro
Claude Sonnet 4 30 tok/s Pro
2000 character limit reached

Distributed Rule Vectors is A Key Mechanism in Large Language Models' In-Context Learning (2406.16007v1)

Published 23 Jun 2024 in cs.CL

Abstract: LLMs have demonstrated remarkable abilities, one of the most important being In-Context Learning (ICL). With ICL, LLMs can derive the underlying rule from a few demonstrations and provide answers that comply with the rule. Previous work hypothesized that the network creates a "task vector" in specific positions during ICL. Patching the "task vector" allows LLMs to achieve zero-shot performance similar to few-shot learning. However, we discover that such "task vectors" do not exist in tasks where the rule has to be defined through multiple demonstrations. Instead, the rule information provided by each demonstration is first transmitted to its answer position and forms its own rule vector. Importantly, all the rule vectors contribute to the output in a distributed manner. We further show that the rule vectors encode a high-level abstraction of rules extracted from the demonstrations. These results are further validated in a series of tasks that rely on rules dependent on multiple demonstrations. Our study provides novel insights into the mechanism underlying ICL in LLMs, demonstrating how ICL may be achieved through an information aggregation mechanism.

Citations (1)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-Up Questions

We haven't generated follow-up questions for this paper yet.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube