Emergent Mind

Abstract

We propose natural language prompt-based retrieval-augmented generation (Prompt-RAG), a novel approach to enhancing the performance of generative LLMs in niche domains. Conventional RAG methods mostly require vector embeddings, yet it is uncertain whether generic LLM-based embedding representations suit specialized domains. To explore and exemplify this point, we compared vector embeddings from Korean Medicine (KM) and Conventional Medicine (CM) documents, finding that KM document embeddings correlated more with token overlaps and less with human-assessed document relatedness, in contrast to CM embeddings. Prompt-RAG, distinct from conventional RAG models, operates without embedding vectors. Its performance was assessed through a Question-Answering (QA) chatbot application, where responses were evaluated for relevance, readability, and informativeness. The results showed that Prompt-RAG outperformed existing models, including ChatGPT and conventional vector embedding-based RAGs, in relevance and informativeness. Despite challenges such as content structuring and response latency, advances in LLMs are expected to encourage the use of Prompt-RAG, making it a promising tool for other domains in need of RAG methods.

Overview

  • Prompt-RAG is a new method for generating responses in niche fields, using natural language prompts instead of vector embeddings.

  • The process involves creating a Table of Contents (ToC) for document retrieval, selecting relevant headings through a generative model, and then generating a response based on the contextual reference.

  • A Korean Medicine (KM) Question-Answering chatbot was developed to test Prompt-RAG, demonstrating better performance than existing models in relevance and informativeness.

  • The study found that, for KM documents, vector embeddings aligned more closely with token overlaps than with human-assessed document relatedness, highlighting the limitations of generic embeddings in niche domains.

  • While Prompt-RAG faces challenges like content structuring and response latency, it shows high potential for specialized knowledge retrieval and may improve as LLMs advance.

Introduction to Prompt-RAG

The study in focus presents Prompt-RAG, a novel natural language prompt-based retrieval-augmented generation method expressly designed for niche domains, exemplified through its application in Korean Medicine (KM). Traditional Retrieval-Augmented Generation (RAG) models use vector embeddings to fetch the information needed to generate responses. However, vector embeddings derived from generic LLMs may not faithfully capture specialized knowledge, an issue that is particularly pronounced in niche fields. Prompt-RAG distinguishes itself from its predecessors by operating without vector embeddings, potentially bypassing this limitation.

Methodology Overview

Prompt-RAG involves three steps: preprocessing, heading selection, and retrieval-augmented generation. The methodology begins with creating a Table of Contents (ToC) from the target document(s), which becomes the basis for retrieval. A large-scale pre-trained generative model then assesses the ToC in conjunction with a user query to select the most pertinent headings. Next, the content associated with these headings is gathered to construct a contextual reference, and the generative model produces a response to the query using that reference. This process relies on the advanced natural-language understanding of modern LLMs.
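The three steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: `llm` stands in for any chat-completion API, and the `stub_llm` used in the demo is a toy substitute so the example runs without a real model.

```python
from typing import Callable

def build_toc(document: dict[str, str]) -> str:
    """Step 1 (preprocessing): format the document's headings as a ToC."""
    return "\n".join(f"{i + 1}. {h}" for i, h in enumerate(document))

def select_headings(toc: str, query: str, llm: Callable[[str], str]) -> list[str]:
    """Step 2 (heading selection): ask the model which ToC headings fit the query."""
    prompt = ("Given this table of contents:\n" + toc +
              f"\n\nList the headings most relevant to the question: {query}")
    return [h.strip() for h in llm(prompt).splitlines() if h.strip()]

def answer(document: dict[str, str], query: str, llm: Callable[[str], str]) -> str:
    """Step 3 (generation): build a contextual reference from the selected
    headings' content and answer the query against it."""
    headings = select_headings(build_toc(document), query, llm)
    reference = "\n\n".join(document[h] for h in headings if h in document)
    return llm(f"Using only this reference:\n{reference}\n\nAnswer: {query}")

def stub_llm(prompt: str) -> str:
    # Toy stand-in for a real generative model, keyed on the prompt's wording.
    if prompt.startswith("Given this table of contents"):
        return "Herbal Medicine"
    return "Answer based on: " + prompt.split("reference:\n")[1].split("\n\nAnswer:")[0]

doc = {
    "Herbal Medicine": "Herbal formulas combine multiple medicinal plants.",
    "Acupuncture": "Acupuncture stimulates specific points on the body.",
}
print(answer(doc, "What are herbal formulas?", stub_llm))
```

In a real deployment, `llm` would wrap an actual chat-completion call, and the retrieved reference would be the body text under each selected heading.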

Experimentation and Findings

In assessing Prompt-RAG's efficacy, the researchers created a Question-Answering (QA) chatbot for KM. The study found that for KM documents, vector embeddings correlated more strongly with token overlaps than with human-assessed document relatedness, a trend not observed in Conventional Medicine (CM) embeddings. When comparing Prompt-RAG's QA chatbot to ChatGPT and conventional RAG models, the results indicated superior performance in relevance and informativeness, albeit with certain challenges such as content structuring and increased response latency.

Implications for Future Research and Application

The findings of the current investigation suggest that Prompt-RAG has considerable potential for applications in specialized domains. By leveraging the linguistic capabilities of LLMs, Prompt-RAG circumvents the limitations associated with conventional RAG models. Its utility is exemplified in the domain of KM but is not confined to it. The authors anticipate that as the generative abilities of LLMs progress and the associated costs decrease, Prompt-RAG could become a powerful tool for information retrieval in a variety of other domains as well. Despite current challenges such as document-structuring requirements and longer response times, the researchers remain optimistic about the model's evolving practicality.
