
Understanding Finetuning for Factual Knowledge Extraction

(2406.14785)
Published Jun 20, 2024 in cs.CL and cs.LG

Abstract

In this work, we study the impact of QA fine-tuning data on downstream factuality. We show that fine-tuning on lesser-known facts that are poorly stored during pretraining yields significantly worse factuality than fine-tuning on well-known facts, even when all facts are seen during pretraining. We prove this phenomenon theoretically, showing that training on lesser-known facts can lead the model to ignore subject entity names and instead output a generic plausible response even when the relevant factual knowledge is encoded in the model. On three question answering benchmarks (PopQA, Entity Questions, and MMLU) and two language models (Llama-2-7B and Mistral-7B), we find that (i) finetuning on a completely factual but lesser-known subset of the data deteriorates downstream factuality (5-10%) and (ii) finetuning on a subset of better-known examples matches or outperforms finetuning on the entire dataset. Ultimately, our results shed light on the interaction between pretrained knowledge and finetuning data and demonstrate the importance of taking into account how facts are stored in the pretrained model when fine-tuning for knowledge-intensive tasks.

Overview

  • The paper investigates the impact of fine-tuning on factual knowledge extraction in LLMs, focusing on question-answering (QA) tasks.

  • The authors find that fine-tuning on lesser-known facts degrades downstream factual accuracy, while restricting fine-tuning to well-known facts matches or outperforms training on the full dataset.

  • The theoretical concept of factual salience is introduced to capture how well a fact is stored in the pretrained model; attention imbalances that arise when fine-tuning on less salient facts are shown to weaken the model's reliance on subject-entity information.

Analyzing the Impact of QA Fine-tuning on Factual Knowledge Extraction

The paper "Understanding Fine-tuning for Factual Knowledge Extraction" by Gaurav Ghosal, Tatsunori Hashimoto, and Aditi Raghunathan offers a comprehensive analysis of how fine-tuning on factual knowledge affects downstream factuality in LLMs. This study rigorously examines the nuances of question-answering (QA) fine-tuning and its impact on factuality, using both empirical experiments and theoretical analyses.

Key Findings

The central theme of this paper is the disparity in factual accuracy depending on the nature of the fine-tuned data. The authors show that fine-tuning on lesser-known facts results in significantly worse factuality compared to fine-tuning on well-known facts. This holds true even when all facts considered during fine-tuning were seen during pretraining. The results are consistent across multiple benchmarks (PopQA, Entity Questions, and MMLU) and two LLMs (Llama-2-7B, Mistral-7B).

Key empirical observations include:

  1. Fine-tuning on a dataset of lesser-known facts degrades downstream factuality by approximately 5-10%.
  2. Fine-tuning on a subset of better-known facts can match or outperform fine-tuning on the entire dataset.

Theoretical Contributions

Theoretically, the authors introduce the concept of factual salience, which measures how well a fact is stored in a pretrained model. The paper proves that training on lesser-known facts can lead the model to rely on generic plausible responses rather than specific factual details. This phenomenon is attributed to an imbalance in attention scores that develops during fine-tuning: training on less salient facts shifts attention away from subject-entity tokens, pushing the model toward generic completions even when the relevant knowledge is encoded.
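
To make the notion concrete, below is a minimal sketch of one possible proxy for factual salience: scoring each QA pair by the log-probability the pretrained model assigns to the gold answer and treating high-scoring facts as well known. The checkpoint name and prompt format are assumptions for illustration; the paper defines salience formally within its theoretical model, so this heuristic is not the authors' exact measure.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative proxy only: approximate how well a fact is "stored" by the
# log-probability the pretrained model assigns to the gold answer. The
# checkpoint name and prompt format are assumptions, not the paper's setup.
MODEL_NAME = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

@torch.no_grad()
def answer_logprob(question: str, answer: str) -> float:
    """Mean log-probability of the answer tokens given a simple QA prompt."""
    prompt_ids = tokenizer(f"Q: {question}\nA:", return_tensors="pt").input_ids.to(model.device)
    answer_ids = tokenizer(" " + answer, add_special_tokens=False, return_tensors="pt").input_ids.to(model.device)
    logits = model(torch.cat([prompt_ids, answer_ids], dim=1)).logits
    # Logits at position t predict token t + 1, so score only the answer span.
    answer_logits = logits[0, prompt_ids.shape[1] - 1 : -1].float()
    logprobs = torch.log_softmax(answer_logits, dim=-1)
    token_logprobs = logprobs.gather(1, answer_ids[0].unsqueeze(1)).squeeze(1)
    return token_logprobs.mean().item()

# Facts scoring high under the pretrained model would be treated as "well known";
# low-scoring facts as "lesser known".
score = answer_logprob("What is the capital of France?", "Paris")
```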

Implications

  1. Practical Implications: The findings suggest a shift in data curation strategies for QA fine-tuning. A key takeaway is that focusing on a smaller set of well-known facts can be sufficient to improve factuality, even on questions about lesser-known facts. This could optimize resource allocation in training large-scale models by reducing the need for extensive datasets.
  2. Theoretical Implications: The introduction and formalization of factual salience offer a foundational concept that could drive future research in understanding how LLMs store and retrieve factual information. This connects the practical aspects of LLM fine-tuning with deeper theoretical insights.

Empirical Analysis

The authors conduct extensive synthetic experiments to test the practical implications of their theoretical findings. By constructing controlled datasets that vary in the popularity of facts, they isolate the variable of interest. The results show that fine-tuning on more popular facts markedly improves downstream factuality, even on less popular test examples.
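
As an illustration of this kind of popularity-based split, the sketch below partitions a PopQA-style dataset by subject popularity and formats each half as QA pairs for separate fine-tuning runs. The dataset identifier and field names (s_pop, obj, question) are assumptions based on the public PopQA release rather than the paper's released code.

```python
from datasets import load_dataset

# Illustrative sketch of a popularity-based split; the dataset id and the field
# names ("s_pop" for subject popularity, "obj" for the answer) are assumptions
# based on the public PopQA release, not the paper's released code.
popqa = load_dataset("akariasai/PopQA", split="test")

# Rank facts by subject popularity and take the top / bottom halves.
rows = sorted(popqa, key=lambda r: r["s_pop"], reverse=True)
half = len(rows) // 2
well_known, lesser_known = rows[:half], rows[half:]

def to_qa_text(row: dict) -> str:
    """Format one fact as a supervised fine-tuning example."""
    return f"Q: {row['question']}\nA: {row['obj']}"

# Each subset would then be fine-tuned separately with a standard causal-LM
# SFT loop and evaluated on held-out questions spanning the full popularity
# range, mirroring the comparison described above.
well_known_texts = [to_qa_text(r) for r in well_known]
lesser_known_texts = [to_qa_text(r) for r in lesser_known]
```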

The study also explores the dynamics of attention in language models. Analysis shows that attention to subject tokens decreases significantly in models fine-tuned on less well-known facts, further corroborating the theoretical findings about attention imbalance.
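
A rough way to reproduce this kind of attention probe is to compare the attention mass that the final (prediction) position places on subject-entity tokens before and after fine-tuning. The sketch below is a simplified version of such a probe under an assumed checkpoint name; it averages attention over layers and heads and locates the subject span by naive token matching, which may differ from the paper's exact analysis.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Simplified attention probe: how much attention does the last (prediction)
# position place on the subject-entity tokens? The checkpoint name and the
# naive subject-span matching below are assumptions for illustration only.
BASE = "meta-llama/Llama-2-7b-hf"
tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE,
    attn_implementation="eager",   # eager attention so weights are returned
    torch_dtype=torch.float16,
    device_map="auto",
)
model.eval()

@torch.no_grad()
def subject_attention_mass(prompt: str, subject: str) -> float:
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model(**inputs, output_attentions=True)
    # out.attentions: one [batch, heads, seq, seq] tensor per layer.
    attn = torch.stack(out.attentions).float().mean(dim=(0, 2))[0]  # [seq, seq]
    from_last = attn[-1]  # attention paid by the final position to every token

    # Naive subject lookup; exact span matching depends on the tokenizer.
    subj_ids = tok(subject, add_special_tokens=False).input_ids
    ids = inputs.input_ids[0].tolist()
    for s in range(len(ids) - len(subj_ids) + 1):
        if ids[s : s + len(subj_ids)] == subj_ids:
            return from_last[s : s + len(subj_ids)].sum().item()
    return 0.0

# Comparing this quantity between the pretrained model and a checkpoint
# fine-tuned on lesser-known facts would expose the drop in subject attention.
mass = subject_attention_mass("Q: In what city was Marie Curie born?\nA:", "Marie Curie")
```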

Future Directions

Given the robust framework and findings, several future research directions can be identified:

  • Developing fine-tuning techniques that explicitly mitigate attention imbalance.
  • Exploring curriculum learning approaches where models are first fine-tuned on well-known facts before integrating less-known facts.
  • Extending the concept of factual salience to other domains and tasks beyond QA, such as summarization or dialogue systems.
  • Investigating the potential of synthetic data generation to improve the efficiency of fine-tuning processes.

Conclusion

This paper lays a vital groundwork for understanding the intricate dynamics of QA fine-tuning on factual accuracy in LLMs. The consistent and significant findings across different datasets and models provide strong evidence for revisiting how fine-tuning datasets are constructed. By integrating theoretical insights with empirical validations, the paper opens new avenues for both improving the performance of LLMs in knowledge-intensive tasks and deepening our understanding of their internal mechanisms.

In sum, the clear delineation of factual salience and its practical implications underscore the importance of strategic data curation in enhancing the factual reliability of AI systems, paving the way for more robust and trustworthy language models.
