Fine-grained Hallucination Detection and Editing for Language Models (2401.06855v4)
Abstract: Large language models (LMs) are prone to generating factual errors, which are often called hallucinations. In this paper, we introduce a comprehensive taxonomy of hallucinations and argue that hallucinations manifest in diverse forms, each requiring varying degrees of careful assessment to verify factuality. We propose a novel task of automatic fine-grained hallucination detection and construct a new evaluation benchmark, FavaBench, that includes about one thousand fine-grained human judgments on three LM outputs across various domains. Our analysis reveals that ChatGPT and Llama2-Chat (70B, 7B) exhibit diverse types of hallucinations in the majority of their outputs in information-seeking scenarios. We train FAVA, a retrieval-augmented LM, by carefully creating synthetic data to detect and correct fine-grained hallucinations. On our benchmark, our automatic and human evaluations show that FAVA significantly outperforms ChatGPT and GPT-4 on fine-grained hallucination detection, and edits suggested by FAVA improve the factuality of LM-generated text.
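The abstract describes FAVA as a retrieval-augmented LM that tags span-level hallucinations in a passage and suggests corrections. The following is a minimal sketch of how such a detect-and-edit loop could be wired; the retriever and generator interfaces, the tag names, and the prompt wording are illustrative assumptions rather than the paper's exact implementation.

```python
# Minimal sketch of a retrieval-augmented detect-and-edit loop in the spirit of FAVA.
# The retrieve/generate interfaces, tag names, and prompt wording are assumptions for
# illustration, not the paper's exact scheme.
import re
from typing import Callable, List, Tuple

# Assumed error labels, loosely following the paper's fine-grained taxonomy.
ERROR_TAGS = ["entity", "relation", "contradictory", "invented", "subjective", "unverifiable"]


def build_prompt(passage: str, evidence: List[str]) -> str:
    """Ask the editor model to tag unsupported spans and propose corrections."""
    context = "\n".join(f"[{i + 1}] {e}" for i, e in enumerate(evidence))
    return (
        "Using the evidence, wrap each unsupported span of the passage in a tag naming its "
        f"error type ({', '.join(ERROR_TAGS)}) and rewrite it so the passage becomes factual.\n\n"
        f"Evidence:\n{context}\n\nPassage:\n{passage}\n\nEdited passage:"
    )


def extract_edits(tagged_output: str) -> List[Tuple[str, str]]:
    """Pull (error_type, span) pairs out of the model's tagged output."""
    pattern = r"<({tags})>(.*?)</\1>".format(tags="|".join(ERROR_TAGS))
    return re.findall(pattern, tagged_output, flags=re.DOTALL)


def detect_and_edit(
    passage: str,
    retrieve: Callable[[str], List[str]],  # hypothetical retriever over a reference corpus
    generate: Callable[[str], str],        # hypothetical call into the fine-tuned editor LM
) -> Tuple[str, List[Tuple[str, str]]]:
    """Retrieve evidence, run the editor model, and return tagged text plus detected errors."""
    evidence = retrieve(passage)
    tagged_output = generate(build_prompt(passage, evidence))
    return tagged_output, extract_edits(tagged_output)
```

In a real setup, `retrieve` would query a dense index over a reference corpus (e.g., Wikipedia) and `generate` would wrap the fine-tuned editor model; the parsed (error type, span) pairs then drive both detection metrics and the final factuality-improving edits.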