Abstract

Retrieval-augmented generation provides language models with external contexts to strengthen their factual grounding. However, language models often struggle when given extensive information, which diminishes their effectiveness at answering questions. Context compression tackles this issue by filtering out irrelevant information, but current methods still fall short in realistic scenarios where crucial information cannot be captured in a single step. To overcome this limitation, we introduce CompAct, a novel framework that employs an active strategy to condense extensive documents without losing key information. Our experiments demonstrate that CompAct brings significant improvements in both performance and compression rate on multi-hop question-answering (QA) benchmarks. CompAct also operates flexibly as a cost-efficient plug-in module with various off-the-shelf retrievers or readers, achieving exceptionally high compression rates (47x).

Figure: The CompAct framework compresses document segments into a compact context for efficient analysis while preserving essential information.

Overview

  • CompAct introduces a novel framework for efficiently handling extensive document contexts in retrieval-augmented question answering (QA), addressing challenges related to extracting and integrating key information from multiple documents.

  • The framework includes an active compression strategy and an early termination mechanism to dynamically manage context size, ensuring essential information is retained while optimizing computational efficiency.

  • CompAct demonstrated significant improvements in compression rates and performance metrics across various QA benchmarks, proving to be a cost-effective plug-in module compatible with multiple retrieval systems and reader models.

CompAct: Compressing Retrieved Documents Actively for Question Answering

CompAct introduces a novel framework for efficiently handling extensive document contexts in retrieval-augmented question answering (QA). While conventional approaches in this domain provide language models with external contexts to enhance factual grounding, they face significant challenges when dealing with large volumes of information. The primary issue lies in the models' difficulty in extracting key information from multiple documents and integrating it, especially in multi-hop QA scenarios where reasoning over several documents is crucial.

Key Contributions

  1. Active Compression Strategy: The authors propose an active strategy for context compression, in which the compressed context is updated sequentially as each newly provided segment is integrated with the previously compressed context. This iterative method ensures that essential information is preserved at each step, yielding a compact context for the QA task (see the sketch after this list).
  2. Early Termination Mechanism: To avoid processing unnecessary information, CompAct introduces an early termination mechanism: at each step, the model evaluates whether the gathered context is complete or whether further information is needed to answer the query. This dynamic approach keeps redundant information out of the context and optimizes computational efficiency.
  3. Cost-Efficient Plug-In Module: The framework is designed to function as a cost-efficient plug-in module compatible with various existing retrieval systems and reader models. It achieves significant compression rates while maintaining high performance across different QA benchmarks, demonstrating its versatility and effectiveness.
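
To make the iteration concrete, below is a minimal sketch of the compress-and-terminate loop described in contributions 1 and 2. The `call_llm` stub, the prompt wording, the segment size, and the `COMPLETE`/`INCOMPLETE` marker format are illustrative assumptions, not the paper's exact interface.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for any instruction-tuned compressor model (hypothetical)."""
    raise NotImplementedError

def compress(question: str, prev_context: str, segment: str) -> tuple[str, str]:
    """One active-compression step: fold a new segment into the running context."""
    prompt = (
        f"Question: {question}\n"
        f"Previously compressed context: {prev_context}\n"
        f"New document segment:\n{segment}\n"
        "Rewrite the context to keep only information needed to answer the "
        "question, then end with COMPLETE if it now suffices, else INCOMPLETE."
    )
    output = call_llm(prompt).strip()
    # Assume the model emits the condition marker on its final line.
    context, _, condition = output.rpartition("\n")
    return context.strip(), condition.strip()

def compact(question: str, documents: list[str],
            segment_size: int = 5, max_steps: int = 6) -> str:
    """Actively compress retrieved documents into a compact context."""
    context = ""
    for step, start in enumerate(range(0, len(documents), segment_size)):
        if step >= max_steps:
            break
        segment = "\n\n".join(documents[start:start + segment_size])
        context, condition = compress(question, context, segment)
        if condition == "COMPLETE":  # early termination: stop once sufficient
            break
    return context  # handed to any off-the-shelf reader model
```

The key design point is that each call sees only the running compressed context plus one new segment, so the compressor's input stays small even when the retriever returns dozens of documents.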

Experimental Results

The framework was evaluated on a variety of QA benchmarks, highlighting substantial improvements over existing methods:

  • Compression Rate: CompAct achieved exceptionally high compression rates, averaging 47x and significantly reducing the token count compared to raw document inputs (see the note after this list).
  • Performance Metrics: The framework demonstrated robust performance in multi-document QA tasks, outperforming existing compression methods by a significant margin. For instance, on the HotpotQA dataset, CompAct achieved a 7.0-point improvement in F1 score over baselines.
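
For reference, the compression rate can be read as the ratio of retrieved-document tokens to compressed-context tokens; the counts below are illustrative assumptions, not figures from the paper:

compression rate = (tokens in retrieved documents) / (tokens in compressed context)

At a 47x rate, roughly 4,700 tokens of retrieved text would be condensed into about 100 tokens of context.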

Implications and Future Directions

The practical implications of CompAct are substantial: by streamlining the handling of large contexts, it benefits applications that require efficient processing of extensive textual data. From a theoretical standpoint, the active compression strategy and early termination mechanism contribute to a better understanding of how to manage and integrate scattered information for complex QA tasks.

Generalizability and Flexibility

CompAct was evaluated with different retrieval systems, such as Contriever and BM25, and demonstrated consistent performance improvements across varying configurations. It was also tested with several reader models, including GPT-3.5-Turbo and LLaMA-3-8B, maintaining stable performance and cost efficiency.
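
As a rough illustration of this plug-in usage, the sketch below wires CompAct between an arbitrary retriever and reader, reusing the hypothetical `compact` function from the earlier loop; `retrieve` and `read` stand in for whatever components a system already uses (e.g., BM25 or Contriever for retrieval, GPT-3.5-Turbo or LLaMA-3-8B as the reader).

```python
def answer(question: str, retrieve, read, top_k: int = 30) -> str:
    """Hypothetical glue code: CompAct slots in between retrieval and reading."""
    documents = retrieve(question, top_k)   # any off-the-shelf retriever
    context = compact(question, documents)  # compressed, token-efficient context
    return read(question, context)          # any off-the-shelf reader model
```

Because compression happens before the reader call, the reader consumes far fewer tokens, which is where the reported API-cost savings come from.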

Component Analysis and Cost Efficiency

An ablation study underscored the importance of the condition token, the marker by which the model judges whether the gathered context is complete, in balancing performance and compression rate. Additionally, CompAct was shown to greatly reduce the API usage costs of proprietary models, demonstrating its economic viability in real-world applications.

Conclusion

CompAct provides a significant advancement in retrieval-augmented QA by actively managing large document contexts and integrating essential information. Its high compression rate, coupled with robust performance across diverse retrieval and reader configurations, demonstrates its potential as a versatile and efficient solution for complex QA tasks. Future research may focus on refining the early termination mechanism and exploring its application in various domains requiring extensive context handling.
