Emergent Mind

Abstract

As LLMs continue to evolve, more are being designed to handle long-context inputs. Despite this advancement, many models struggle to achieve high precision on long-context tasks, often showing a "lost in the middle" issue. This paper identifies the root of these issues as a deficiency in retrieval capabilities, exacerbated by the sparsity of key information in long contexts. To tackle this challenge, we introduce a novel approach called "Paraphrasing the Original Text", aimed at augmenting LLMs' proficiency in extracting information from long contexts. This enhancement is achieved through a specialized supervised fine-tuning stage that incorporates paraphrasing information into training samples, thereby improving the model's retrieval capabilities in long-context scenarios. In tests on LongBench and the NaturalQuestions Multi-document QA dataset, our method demonstrated significant improvements on long-context tasks, effectively addressing the "lost in the middle" dilemma. Specifically, we observed average performance increases of 6.4% and 5.9% on these datasets, respectively. Moreover, our approach is efficient, requiring minimal overhead: fine-tuning on just 19k samples. The model and training data have been made available on HuggingFace (https://huggingface.co/yuyijiong/Qwen-14b-chat-yarn-32k).
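
The abstract does not give the exact format of the training samples, so the following is only a minimal sketch of what "incorporating paraphrasing information into training samples" might look like. It assumes the target response first restates the relevant passage from the long context and then gives the answer. The names `SFTSample` and `build_sample`, the prompt layout, and the response template are all hypothetical, and a verbatim restatement stands in here for whatever paraphrase the authors actually use.

```python
from dataclasses import dataclass


@dataclass
class SFTSample:
    """One supervised fine-tuning pair (hypothetical structure)."""
    prompt: str
    target: str


def build_sample(documents, question, evidence, answer):
    """Build a sample whose target restates the evidence before answering.

    Assumption: restating the key passage in the target pushes the model
    to locate that passage in the long context before it answers.
    """
    # Guard against evidence that does not actually come from the context.
    if not any(evidence in doc for doc in documents):
        raise ValueError("evidence must appear in one of the documents")

    # Concatenate all documents into one long context, then append the question.
    context = "\n\n".join(
        f"Document {i + 1}: {doc}" for i, doc in enumerate(documents)
    )
    prompt = f"{context}\n\nQuestion: {question}"

    # Hypothetical response template: restate the source text, then answer.
    target = f'The original text states: "{evidence}" So the answer is: {answer}'
    return SFTSample(prompt=prompt, target=target)


sample = build_sample(
    documents=[
        "The Louvre is a museum in Paris.",
        "The Eiffel Tower was completed in 1889.",
        "Mont Blanc is the highest peak in the Alps.",
    ],
    question="When was the Eiffel Tower completed?",
    evidence="The Eiffel Tower was completed in 1889.",
    answer="1889",
)
```

In this sketch the key sentence sits in the middle document, which is the position the "lost in the middle" issue refers to; the sample would still need to be tokenized and passed to a fine-tuning pipeline, which is outside the scope of the example.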
