Mitigating Entity-Level Hallucination in Large Language Models

(2407.09417)
Published Jul 12, 2024 in cs.CL and cs.IR

Abstract

The emergence of LLMs has revolutionized how users access information, shifting from traditional search engines to direct question-and-answer interactions with LLMs. However, the widespread adoption of LLMs has revealed a significant challenge known as hallucination, wherein LLMs generate coherent yet factually inaccurate responses. This hallucination phenomenon has led to users' distrust in information retrieval systems based on LLMs. To tackle this challenge, this paper proposes Dynamic Retrieval Augmentation based on hallucination Detection (DRAD) as a novel method to detect and mitigate hallucinations in LLMs. DRAD improves upon traditional retrieval augmentation by dynamically adapting the retrieval process based on real-time hallucination detection. It features two main components: Real-time Hallucination Detection (RHD) for identifying potential hallucinations without external models, and Self-correction based on External Knowledge (SEK) for correcting these errors using external knowledge. Experiment results show that DRAD demonstrates superior performance in both detecting and mitigating hallucinations in LLMs. All of our code and data are open-sourced at https://github.com/oneal2000/EntityHallucination.

Figure: Proposed Real-time Hallucination Detection (RHD) framework.

Overview

  • The paper addresses the challenge of hallucination in LLMs by introducing a novel method named Dynamic Retrieval Augmentation based on hallucination Detection (DRAD).

  • DRAD incorporates a Real-time Hallucination Detection (RHD) mechanism to identify potential hallucinations and a Self-correction based on External Knowledge (SEK) mechanism to correct them using relevant external knowledge.

  • Experimental results demonstrate that DRAD outperforms existing methods in detection and mitigation of hallucinations, showcasing significant improvements in accuracy and efficiency across multiple benchmark datasets.

Mitigating Entity-Level Hallucination in LLMs

The paper "Mitigating Entity-Level Hallucination in LLMs" addresses a prevalent challenge in NLP, specifically the issue of hallucination in LLMs. This phenomenon, where an LLM generates text that is coherent but factually incorrect, significantly undermines user trust in LLM-based applications.

The authors introduce a novel method, Dynamic Retrieval Augmentation based on hallucination Detection (DRAD), designed to detect and mitigate hallucinations in real-time during the LLM's text generation process. The proposed approach builds upon traditional Retrieval-Augmented Generation (RAG) methods by dynamically adapting the retrieval process based on real-time hallucination detection.
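At a high level, DRAD can be read as a generate-detect-retrieve loop. The following is a minimal sketch of that loop, not the authors' implementation; all four callables (generate_step, is_hallucinated, retrieve, regenerate) are hypothetical placeholders for an LLM decoding step, the RHD check, a retriever over an external corpus, and the SEK-style rewrite.

```python
from typing import Callable, List, Tuple

# A minimal sketch of DRAD's generate-detect-retrieve loop, not the authors'
# implementation. The four callables are hypothetical placeholders supplied by
# the caller: an LLM decoding step, the RHD check, a retriever over an
# external corpus, and the SEK-style regeneration step.
def drad_generate(
    generate_step: Callable[[str], Tuple[str, List[float]]],  # prompt -> (next sentence, token log-probs)
    is_hallucinated: Callable[[str, List[float]], bool],       # RHD: does this sentence look risky?
    retrieve: Callable[[str], List[str]],                      # query -> supporting passages
    regenerate: Callable[[str, List[str]], str],               # prompt + evidence -> corrected sentence
    question: str,
    max_sentences: int = 8,
) -> str:
    answer = ""
    for _ in range(max_sentences):
        prompt = question + "\n" + answer
        sentence, logprobs = generate_step(prompt)
        if not sentence:                                       # model has finished answering
            break
        if is_hallucinated(sentence, logprobs):
            # Retrieval is triggered only when a hallucination is detected,
            # which keeps the number of retrieval calls low.
            evidence = retrieve(question + " " + sentence)
            sentence = regenerate(prompt, evidence)
        answer += sentence + " "
    return answer.strip()
```

The key design choice is that retrieval happens conditionally, only for flagged sentences, rather than once per query (single-round RAG) or at every step (multi-round RAG).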

Methodology

Real-time Hallucination Detection (RHD)

A central component of DRAD is the Real-time Hallucination Detection (RHD) mechanism. RHD identifies potential hallucinations as text is being generated, without relying on external models, which preserves computational efficiency. The core idea is to measure the model's uncertainty about the entities it generates: for each entity, RHD evaluates its generation probability and the entropy of the underlying token distributions, and entities produced with low probability and high entropy are flagged as potential hallucinations.
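A minimal sketch of this uncertainty scoring is shown below, assuming access to per-token probabilities and next-token distributions for the tokens of each generated entity; the thresholds are illustrative assumptions, not values from the paper.

```python
import math
from typing import List

# Illustrative RHD-style scoring. It assumes access to the probability of each
# token in an entity and to the full next-token distributions at those
# positions; the thresholds below are made-up examples, not the paper's values.

def entity_probability(token_probs: List[float]) -> float:
    """Average-pool the probabilities of the tokens that form an entity."""
    return sum(token_probs) / len(token_probs)

def entity_entropy(token_distributions: List[List[float]]) -> float:
    """Mean Shannon entropy of the next-token distributions over the entity's tokens."""
    def h(dist: List[float]) -> float:
        return -sum(p * math.log(p) for p in dist if p > 0)
    return sum(h(d) for d in token_distributions) / len(token_distributions)

def is_potential_hallucination(token_probs: List[float],
                               token_distributions: List[List[float]],
                               prob_threshold: float = 0.3,
                               entropy_threshold: float = 2.0) -> bool:
    """Flag an entity generated with low probability and high uncertainty."""
    return (entity_probability(token_probs) < prob_threshold
            and entity_entropy(token_distributions) > entropy_threshold)
```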

Self-correction based on External Knowledge (SEK)

Once a hallucination is detected, the Self-correction based on External Knowledge (SEK) mechanism is triggered. SEK corrects the hallucinated output by retrieving relevant external knowledge and integrating it back into the LLM’s text generation. This process involves formulating a search query based on the context where hallucination occurs, retrieving pertinent documents from an external corpus (e.g., Wikipedia), and revising the output to mitigate the hallucination.
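A hedged sketch of this correction step is given below; the llm and search callables are assumptions (a completion function and a retriever over an external index such as Wikipedia), and the prompt wording and query truncation are illustrative choices rather than the paper's exact prompts.

```python
from typing import Callable, List

# Sketch of an SEK-style correction step. The llm and search callables are
# assumptions (a completion function and a retriever over e.g. a Wikipedia
# index); the prompt wording and query truncation are illustrative.
def self_correct(
    llm: Callable[[str], str],                 # prompt -> completion
    search: Callable[[str, int], List[str]],   # (query, k) -> top-k passages
    question: str,
    answer_so_far: str,
    flagged_sentence: str,
    k: int = 3,
) -> str:
    # Build the query from the context in which the hallucination occurred.
    query = (question + " " + flagged_sentence)[:256]
    passages = search(query, k)
    prompt = (
        "Evidence:\n" + "\n".join(passages) + "\n\n"
        f"Question: {question}\n"
        f"Answer so far: {answer_so_far}\n"
        "Rewrite the next sentence so that it is consistent with the evidence:\n"
        f"{flagged_sentence}\n"
    )
    return llm(prompt)
```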

Experimental Results

The experimental evaluation demonstrates that DRAD outperforms existing single-round and multiple-round retrieval augmentation methods on multiple question-answering (QA) benchmark datasets, including 2WikiMultihopQA, StrategyQA, and NQ.

Hallucination Detection

The RHD method achieves state-of-the-art (SOTA) performance in hallucination detection. On the WikiBio GPT-3 dataset it reaches an AUC of 89.31, surpassing baselines such as SelfCheckGPT variants and predictive-probability-based methods. Among the strategies for pooling token probabilities into an entity-level score, average pooling performs best.
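As a concrete illustration of the pooling choice (the numbers below are invented for the example), average pooling smooths over a single low-probability token, whereas min pooling is dominated by it:

```python
# Toy comparison of pooling strategies over one entity's token probabilities
# (values are fabricated purely for illustration).
token_probs = [0.92, 0.15, 0.88]                      # a three-token entity
avg_pool = sum(token_probs) / len(token_probs)        # ~0.65: robust to one weak token
min_pool = min(token_probs)                           # 0.15: dominated by the weakest token
print(f"average pooling: {avg_pool:.2f}, min pooling: {min_pool:.2f}")
```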

Text Generation

DRAD shows significant improvements across diverse datasets. On 2WikiMultihopQA, it achieves an F1 score of 0.4732 and an exact match (EM) score of 0.39 with only 1.40 retrievals on average, which is markedly more efficient than multi-round methods such as FLARE that require far more retrieval calls. DRAD also performs well on simpler datasets such as NQ and StrategyQA, demonstrating its versatility and robustness.

Discussion and Future Directions

While DRAD efficiently mitigates hallucinations in LLMs, it primarily addresses hallucinations arising from knowledge gaps, rather than those due to erroneous pre-training knowledge. Future research could focus on developing detection mechanisms that can differentiate between these two types of hallucinations. Additionally, given that the real-time detection requires token probability data, which may not always be accessible, new methods that circumvent this limitation should be explored.

The practical implications of DRAD are substantial. By enhancing the factual accuracy of LLM-generated text, it can significantly improve user trust and the applicability of LLMs in various domains, from automated customer service to academic research.

Conclusions

The paper presents a comprehensive framework that not only detects hallucinations in real-time but also effectively mitigates them using external knowledge. The dual components of RHD and SEK within DRAD synergize to form a robust solution to a chronic issue in LLMs. Future developments could extend its capabilities and address current limitations, paving the way for more reliable and trustworthy AI applications.
