DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Published 7 Sep 2023 in cs.CL, cs.AI, and cs.LG | (2309.03883v2)

Abstract: Despite their impressive capabilities, LLMs are prone to hallucinations, i.e., generating content that deviates from facts seen during pretraining. We propose a simple decoding strategy for reducing hallucinations with pretrained LLMs that does not require conditioning on retrieved external knowledge nor additional fine-tuning. Our approach obtains the next-token distribution by contrasting the differences in logits obtained from projecting the later layers versus earlier layers to the vocabulary space, exploiting the fact that factual knowledge in an LLM has generally been shown to be localized to particular transformer layers. We find that this Decoding by Contrasting Layers (DoLa) approach is able to better surface factual knowledge and reduce the generation of incorrect facts. DoLa consistently improves the truthfulness across multiple-choice tasks and open-ended generation tasks, for example improving the performance of LLaMA family models on TruthfulQA by 12-17% absolute points, demonstrating its potential in making LLMs reliably generate truthful facts.

Summary

The paper introduces DoLa, a decoding method that contrasts layers within an LLM to reduce factual hallucinations.
It uses Jensen-Shannon divergence to choose, at each decoding step, which earlier ("premature") layer to contrast against a later ("mature") layer.
DoLa significantly improves factual accuracy on several benchmarks with minimal added decoding latency, supporting more reliable deployment in critical applications.

Analysis of "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

The paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models" presents a decoding method aimed at hallucinations in LLMs. Hallucination, where a model generates content that deviates from facts seen during pretraining, is a significant problem, particularly in high-stakes applications such as legal and clinical settings. The authors propose a simple decoding strategy, Decoding by Contrasting Layers (DoLa), that improves factual accuracy without additional fine-tuning or retrieved external knowledge.

Methodology

Transformer LLMs are stacks of layers that encode different kinds of syntactic and semantic information, and DoLa uses this internal structure to address hallucinations. Past research indicates distinct roles for different layers: earlier layers tend to encode syntactic knowledge, while later layers carry more semantic and factual information. DoLa projects two layers to the vocabulary space and contrasts their logits: a 'mature' layer (typically later in the model) and a 'premature' layer (chosen dynamically from earlier layers). Contrasting the two emphasizes the factual content that emerges in later layers while reducing reliance on less robust syntactic predictions.
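The contrast described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names, the toy three-token vocabulary, and the `alpha` plausibility threshold (a filter that keeps only tokens the mature layer itself considers reasonably likely) are assumptions made for the example. The logits for both layers are assumed to have been computed already.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D logit vector."""
    shifted = logits - np.max(logits)
    return shifted - np.log(np.sum(np.exp(shifted)))

def contrast_layers(mature_logits, premature_logits, alpha=0.1):
    """Score tokens by log p_mature - log p_premature (hypothetical helper).

    alpha is an assumed plausibility threshold: tokens whose mature-layer
    probability is below alpha * max probability are masked to -inf so the
    contrast cannot promote tokens the mature layer finds implausible.
    """
    log_p_mature = log_softmax(np.asarray(mature_logits, dtype=float))
    log_p_premature = log_softmax(np.asarray(premature_logits, dtype=float))
    keep = log_p_mature >= np.log(alpha) + log_p_mature.max()
    return np.where(keep, log_p_mature - log_p_premature, -np.inf)

# Toy example: the mature layer barely prefers token 0, but the premature
# layer already strongly prefers it, so the contrast favors token 1.
scores = contrast_layers([2.0, 1.9, -5.0], [2.0, 0.0, -5.0])
next_token = int(np.argmax(scores))
```

In this toy case the mature layer alone would pick token 0, while the contrast picks token 1, whose probability grows most between the premature and mature layers. Token 2 is masked because the mature layer assigns it almost no probability.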

The method builds on existing early-exit strategies in transformer models, in which intermediate layers as well as the final layer are projected to the vocabulary space. At each decoding step, DoLa computes the Jensen-Shannon divergence between each candidate early layer's distribution and the final layer's distribution, and selects the most divergent candidate as the premature layer. The choice of layer therefore adapts to the token being predicted instead of being fixed in advance.
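The selection step can be illustrated as follows. This is a sketch under stated assumptions, not the paper's code: `select_premature_layer` and the dictionary of per-layer logits are hypothetical names, and the early-exit logits are assumed to be available for a handful of candidate layers.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

def kl(p, q, eps=1e-12):
    """KL(p || q) with a small epsilon to avoid log(0)."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def jsd(p, q):
    """Jensen-Shannon divergence between two distributions."""
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def select_premature_layer(candidate_logits, mature_logits):
    """Return the candidate layer whose distribution diverges most from
    the mature layer's, plus all divergences (hypothetical helper)."""
    p_mature = softmax(np.asarray(mature_logits, dtype=float))
    divergences = {
        layer: jsd(p_mature, softmax(np.asarray(logits, dtype=float)))
        for layer, logits in candidate_logits.items()
    }
    best = max(divergences, key=divergences.get)
    return best, divergences

# Toy early-exit logits for three candidate layers (made-up values).
candidates = {2: [0.0, 0.0, 0.0], 8: [2.0, 0.1, 0.0], 16: [3.0, 0.0, 0.0]}
best_layer, divs = select_premature_layer(candidates, [3.0, 0.0, 0.0])
```

Here layer 2 is selected because its uniform distribution is furthest from the mature layer's, while layer 16 already matches the mature layer and has zero divergence. In a real decoder this selection would be repeated for every generated token.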

Experimental Evaluation

The authors ran experiments on established benchmarks, including TruthfulQA, StrategyQA, FACTOR, and open-ended Vicuna QA. The results show consistent improvements in the factual accuracy of LLaMA models across all tested sizes (7B to 65B) when DoLa is used. Most notable is the improvement of 12 to 17 absolute percentage points on TruthfulQA, which surpasses baselines such as Contrastive Decoding and Inference Time Intervention.

For chain-of-thought tasks such as StrategyQA and GSM8K, DoLa's layer selection makes effective use of contrasts with early layers and outperforms Contrastive Decoding (CD), which often suffers from an inappropriate choice of amateur model.

Furthermore, the method adds only negligible decoding latency, which matters for real-time applications and keeps DoLa practical to deploy in operational settings.

Implications and Future Directions

The implications of this research are substantial for AI and NLP. By exploiting intrinsic architectural features of LLMs without further training, DoLa offers a lightweight way to reduce factual inaccuracies and so improve the reliability of LLMs in practical deployments. Because it adds little computational cost, it remains feasible in environments where both efficiency and accuracy are paramount.

Future research could explore integrating DoLa with reinforcement learning strategies or retrieval-augmented generation to further harness external factual databases, potentially addressing hallucination issues originating from training data biases. Additionally, extending this methodology to non-transformer architectures could open new avenues for enhancing model factuality across the board.

Conclusion

In conclusion, DoLa is a significant step toward reliable LLM deployment. It uses internal model features to improve factual outputs efficiently, which makes it relevant both to model interpretability and to applications in data-sensitive fields. The work enriches the understanding and mitigation of hallucinations in LLMs and marks progress toward using AI in complex real-world applications.