RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots (2403.01193v3)

Published 2 Mar 2024 in cs.CL and cs.AI

Abstract: LLMs like ChatGPT demonstrate the remarkable progress of artificial intelligence. However, their tendency to hallucinate -- generate plausible but false information -- poses a significant challenge. This issue is critical, as seen in recent court cases where ChatGPT's use led to citations of non-existent legal rulings. This paper explores how Retrieval-Augmented Generation (RAG) can counter hallucinations by integrating external knowledge with prompts. We empirically evaluate RAG against standard LLMs using prompts designed to induce hallucinations. Our results show that RAG increases accuracy in some cases, but can still be misled when prompts directly contradict the model's pre-trained understanding. These findings highlight the complex nature of hallucinations and the need for more robust solutions to ensure LLM reliability in real-world applications. We offer practical recommendations for RAG deployment and discuss implications for the development of more trustworthy LLMs.
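The core mechanism the abstract describes, augmenting a prompt with retrieved external knowledge before generation, can be illustrated with a minimal sketch. The tiny knowledge base, the keyword-overlap retriever, and the `generate` stub below are illustrative assumptions for exposition only, not the paper's actual pipeline or evaluation setup.

```python
# Minimal RAG sketch: retrieve supporting passages, then prepend them to the prompt.
# The knowledge base, the keyword-overlap retriever, and generate() are placeholders,
# not the paper's actual system.

from collections import Counter

KNOWLEDGE_BASE = [
    "Retrieval-Augmented Generation grounds model outputs in external documents.",
    "Hallucinations are plausible but false statements produced by language models.",
    "Court filings must cite rulings that actually exist in the legal record.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query and return the top k."""
    query_terms = Counter(query.lower().split())
    def score(doc: str) -> int:
        return sum(query_terms[t] for t in doc.lower().split() if t in query_terms)
    return sorted(docs, key=score, reverse=True)[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the question into a grounded prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. If the context does not contain "
        "the answer, say you do not know.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def generate(prompt: str) -> str:
    """Placeholder for an LLM call; swap in a real model client here."""
    return f"[model output for a prompt of {len(prompt)} characters]"

if __name__ == "__main__":
    question = "What are hallucinations in language models?"
    passages = retrieve(question, KNOWLEDGE_BASE)
    print(generate(build_prompt(question, passages)))
```

The instruction to answer "using only the context" reflects the grounding idea tested in the paper; as the abstract notes, such grounding helps in some cases but can still fail when the prompt itself contradicts the model's pre-trained knowledge.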
