Retrieval-Augmented Generation for Large Language Models: A Survey (2312.10997v5)

Published 18 Dec 2023 in cs.CL and cs.AI

Abstract: LLMs showcase impressive capabilities but encounter challenges like hallucination, outdated knowledge, and non-transparent, untraceable reasoning processes. Retrieval-Augmented Generation (RAG) has emerged as a promising solution by incorporating knowledge from external databases. This enhances the accuracy and credibility of the generation, particularly for knowledge-intensive tasks, and allows for continuous knowledge updates and integration of domain-specific information. RAG synergistically merges LLMs' intrinsic knowledge with the vast, dynamic repositories of external databases. This comprehensive review paper offers a detailed examination of the progression of RAG paradigms, encompassing the Naive RAG, the Advanced RAG, and the Modular RAG. It meticulously scrutinizes the tripartite foundation of RAG frameworks, which includes the retrieval, the generation and the augmentation techniques. The paper highlights the state-of-the-art technologies embedded in each of these critical components, providing a profound understanding of the advancements in RAG systems. Furthermore, this paper introduces an up-to-date evaluation framework and benchmark. Finally, the article delineates the challenges currently faced and points out prospective avenues for research and development.

Summary

  • The paper surveys RAG, which integrates document retrieval with LLMs to reduce hallucinations and keep generated information current.
  • It outlines the evolution from Naive to Modular RAG, emphasizing enhanced retriever precision and modularity in processing pipelines.
  • The study compares RAG with fine-tuning and prompt engineering, highlighting future challenges in multimodal applications and retrieval efficiency.

Retrieval-Augmented Generation for LLMs: A Survey

Introduction to RAG

Retrieval-Augmented Generation (RAG) represents a significant advancement in enhancing the capabilities of LLMs by coupling their generative abilities with information retrieval techniques. This framework addresses limitations of LLMs, such as hallucination and outdated knowledge, by retrieving relevant document chunks from external databases. In doing so, RAG not only improves the accuracy and credibility of generated responses but also enables real-time knowledge updates and the integration of domain-specific information (Figure 1).

Figure 1: Technology tree of RAG research, covering work at the pre-training, fine-tuning, and inference stages, with an emphasis on leveraging the in-context learning abilities of LLMs.

RAG Paradigms

RAG research has evolved from the Naive RAG paradigm through Advanced RAG to the Modular RAG framework, with each paradigm representing a progressive enhancement over its predecessors.

  1. Naive RAG: This initial approach integrates basic indexing, retrieval, and generation processes. By retrieving relevant context from external sources and combining it with LLM capabilities, Naive RAG establishes a baseline for subsequent developments.
  2. Advanced RAG: This paradigm introduces optimization strategies for both pre-retrieval and post-retrieval processes. It focuses on enhancing retrieval precision, query optimization, and document relevance, thus addressing the inadequacies found in the Naive RAG.
  3. Modular RAG: Building on its predecessors, Modular RAG adds flexibility and adaptability by incorporating dedicated modules for search, memory, and routing. This design allows a more nuanced interaction between retrieval and generation components and facilitates iterative and adaptive retrieval strategies (Figure 2); a minimal code sketch of the three paradigms follows the figure below.

    Figure 2: Comparison of RAG paradigms from Naive to Modular, highlighting process enhancements and structural flexibility.
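
To make the differences between the paradigms concrete, the following minimal sketch expresses each one as a composition of interchangeable components. Every callable name here (rewrite, retrieve, rerank, generate, needs_more_context) is a hypothetical placeholder standing in for an embedding model, vector store, reranker, or LLM, not an API from the paper or any particular library; treat it as one possible composition, not a reference implementation.

```python
from typing import Callable, List

# Hypothetical component signatures; any concrete retriever, reranker,
# or LLM could sit behind these callables.
RewriteFn = Callable[[str], str]                   # pre-retrieval query optimization
RetrieveFn = Callable[[str, int], List[str]]       # (query, k) -> top-k chunks
RerankFn = Callable[[str, List[str]], List[str]]   # post-retrieval reordering
GenerateFn = Callable[[str, List[str]], str]       # (query, context) -> answer


def naive_rag(query: str, retrieve: RetrieveFn, generate: GenerateFn, k: int = 5) -> str:
    """Naive RAG: retrieve top-k chunks, then generate in a single pass."""
    return generate(query, retrieve(query, k))


def advanced_rag(query: str, rewrite: RewriteFn, retrieve: RetrieveFn,
                 rerank: RerankFn, generate: GenerateFn,
                 k: int = 20, keep: int = 5) -> str:
    """Advanced RAG: wrap pre-retrieval (query rewriting) and
    post-retrieval (reranking) optimizations around the naive flow."""
    candidates = retrieve(rewrite(query), k)     # over-retrieve a candidate pool
    context = rerank(query, candidates)[:keep]   # keep only the most relevant chunks
    return generate(query, context)


def modular_rag(query: str, rewrite: RewriteFn, retrieve: RetrieveFn,
                rerank: RerankFn, generate: GenerateFn,
                needs_more_context: Callable[[str], bool],
                max_rounds: int = 3) -> str:
    """Modular RAG (one possible composition): iterate retrieval and
    generation adaptively until the draft answer stops signalling gaps."""
    context: List[str] = []
    probe = query
    answer = ""
    for _ in range(max_rounds):
        context += rerank(query, retrieve(rewrite(probe), 20))[:5]
        answer = generate(query, context)
        if not needs_more_context(answer):
            break
        probe = answer  # let the draft answer steer the next retrieval round
    return answer
```

The point of the sketch is structural: Advanced RAG changes what happens before and after retrieval, while Modular RAG changes how the stages are wired together.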

Implementation of RAG in LLMs

RAG's core application has been to improve the performance of LLMs, such as GPT-3, by grounding their responses in retrieved external knowledge. The process involves three foundational steps:

  • Indexing: Documents are split into chunks, encoded into vectors, and stored in a vector database (Figure 3). This foundational step is crucial for the efficient similarity-based retrieval of context.
  • Retrieval: Top-k relevant document chunks are retrieved based on their semantic similarity to the posed query. This retrieval process is fine-tuned to ensure the most relevant information is selected for subsequent generation.
  • Generation: The retrieved chunks are merged with the original query and passed to an LLM, which generates the final response. This allows the system to weave new, contextually accurate information into its answers (Figure 3); a minimal code sketch of these three steps follows the figure below.

    Figure 3: A representative instance of the RAG process applied to question answering, detailing indexing, retrieval, and generation steps.
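
As a concrete illustration of these three steps, the minimal sketch below indexes a few documents, retrieves the top-k chunks by cosine similarity, and assembles the augmented prompt. The embed and llm callables are assumed stand-ins for an embedding model and a generator; only NumPy is used for the similarity search, whereas a production system would typically use a dedicated vector database rather than an in-memory matrix.

```python
import numpy as np
from typing import Callable, List


def chunk(text: str, size: int = 500) -> List[str]:
    """Indexing, step 1: split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_index(docs: List[str], embed: Callable[[str], np.ndarray]):
    """Indexing, step 2: encode every chunk and stack the vectors into a matrix."""
    chunks = [c for doc in docs for c in chunk(doc)]
    vectors = np.stack([embed(c) for c in chunks])
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit-normalize once
    return chunks, vectors


def retrieve(query: str, chunks: List[str], vectors: np.ndarray,
             embed: Callable[[str], np.ndarray], k: int = 3) -> List[str]:
    """Retrieval: rank chunks by cosine similarity to the query embedding."""
    q = embed(query)
    q = q / np.linalg.norm(q)
    top = np.argsort(-(vectors @ q))[:k]
    return [chunks[i] for i in top]


def generate_answer(query: str, docs: List[str],
                    embed: Callable[[str], np.ndarray],
                    llm: Callable[[str], str]) -> str:
    """Generation: merge the retrieved chunks with the query and call the LLM."""
    chunks, vectors = build_index(docs, embed)
    context = "\n\n".join(retrieve(query, chunks, vectors, embed))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return llm(prompt)
```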

Comparative Analysis with Other Methods

RAG methods are contrasted with other optimization techniques such as Fine-Tuning and Prompt Engineering. RAG augments the model with external knowledge at inference time, Fine-Tuning adapts the model through additional training, and Prompt Engineering exploits the model's inherent capabilities without modifying its parameters. Integrating these methods within RAG's Modular framework has enabled more sophisticated retrieval techniques, offering a more comprehensive solution for knowledge-intensive tasks (Figure 4); the short sketch after the figure illustrates where each technique acts.

Figure 4: RAG compared with other model optimization methods, highlighting the balance between external knowledge requirements and model adaptability.
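
The short sketch below shows where each technique intervenes under these assumptions: prompt engineering reshapes only the instruction, RAG additionally injects retrieved context into the prompt, and fine-tuning (noted only in a comment) modifies the model itself rather than the prompt. The llm and retrieve callables are hypothetical placeholders.

```python
def prompt_engineering(llm, question: str) -> str:
    # Prompt engineering: rely solely on the model's parametric knowledge,
    # shaping behaviour through the instruction alone.
    return llm("You are a careful domain expert. Think step by step.\n" + question)


def retrieval_augmented(llm, retrieve, question: str) -> str:
    # RAG: inject non-parametric (external) knowledge into the prompt
    # at inference time, without touching the model's weights.
    context = "\n".join(retrieve(question, 3))
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

# Fine-tuning, by contrast, updates the model's parameters on task-specific
# data, so it would change `llm` itself rather than the prompt passed to it.
```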

Future Directions and Challenges

The survey identifies ongoing challenges within RAG systems, emphasizing the need for more robust retrieval mechanisms and tighter integration strategies. As RAG continues to evolve, its applications are expanding into multimodal domains, incorporating not only textual but also visual and auditory data to provide comprehensive, contextually enriched responses. The survey suggests that future research focus on improving retrieval efficiency, developing evaluation benchmarks, and exploring hybrid methods that combine parametric (model-internal) and non-parametric (retrieved) knowledge.

Conclusion

Retrieval-Augmented Generation for LLMs represents a crucial development in natural language processing, providing a means to overcome the inherent limitations of standalone LLMs. Through the progressive enhancements encapsulated in the Naive, Advanced, and Modular RAG paradigms, this technology has established a framework for integrating contextual knowledge into AI models. The ongoing research and development in this area promise further improvements in the versatility and applicability of LLMs across various domains, ultimately contributing to the advancement of AI-driven technologies (Figure 5).

Figure 5: A comprehensive summary of the RAG ecosystem, illustrating interconnected developments and applications.
