How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models
(2407.00369)

Abstract
Given the growing influx of misinformation across news and social media, there is a critical need for systems that can provide effective real-time verification of news claims. Verification based on large language or multimodal models has been proposed to scale up online policing mechanisms for mitigating the spread of false and harmful content. While these can potentially reduce the burden on human fact-checkers, such efforts may be hampered by foundation model training data becoming outdated. In this work, we test the limits of improving foundation model performance without continual updating through an initial study of knowledge transfer, using either existing intra- and inter-domain benchmarks or explanations generated from LLMs. We evaluate on 12 public benchmarks for fact-checking and misinformation detection, as well as two other tasks relevant to content moderation: toxicity and stance detection. Our results on two recent multimodal fact-checking benchmarks, Mocheg and Fakeddit, indicate that knowledge transfer strategies can improve Fakeddit performance over the state of the art by up to 1.7% and Mocheg performance by up to 2.9%.
Overview
- The paper explores real-time verification systems to combat misinformation on social media, emphasizing the use of knowledge transfer across intra- and inter-domain benchmarks without requiring continual updates.
- The methodology includes evaluating twelve public benchmarks, utilizing a vision-and-language classification model, diverse dataset mixtures, and LLM-generated explanations to enhance fact verification.
- The study demonstrates significant performance improvements on fact-checking benchmarks and highlights implications for reducing human effort in verification and improving model robustness to domain shifts.
The paper "How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models" addresses the urgent need for effective real-time verification systems to combat misinformation proliferating through news and social media. The study examines how far foundation model performance can be improved without continual updating, focusing on fact verification via knowledge transfer across intra- and inter-domain benchmarks. It also examines the potential benefits of using LLM-generated explanations to train fact verification systems.
Methodology and Experimentation
The authors evaluate their methodology on twelve public benchmarks encompassing fact-checking and misinformation detection, as well as toxicity and stance detection tasks. They investigate model performance on Fakeddit and Mocheg, two recent multimodal fact-checking benchmarks. The results show improvements over the state of the art by up to 1.7% on Fakeddit and 2.9% on Mocheg, indicating the efficacy of the proposed knowledge transfer strategies.
Key components of their approach include:
- Base Verification Architecture: The models utilize a vision-and-language classification model (CLIP) to encode textual and visual evidence. This is then processed through a classifier to predict claim veracity.
- Dataset Mixtures: The study explores various dataset mixtures for both intra- and inter-domain learning. The findings highlight the importance of data diversity over purely dataset scale, demonstrating significant brittleness to domain shift.
- LLM Explanations: The authors augment training data with explanations generated using GPT-3.5-turbo and GPT-4o, demonstrating that these explanations can improve model performance, highlighted by a 10.89% improvement in the Mocheg F1 score with GPT-4o.
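The base architecture described above can be sketched structurally: a CLIP-style text encoder and image encoder each produce a fixed-size embedding, the two embeddings are concatenated, and a classification head predicts claim veracity. The sketch below is a minimal illustration under assumptions, not the authors' implementation: the encoder functions are deterministic stand-ins (real CLIP weights are not loaded), and the label set and embedding size are hypothetical.

```python
import numpy as np

# Assumed embedding dimension; CLIP ViT-B/32, for instance, produces 512-d vectors.
EMB_DIM = 512
NUM_CLASSES = 3  # hypothetical labels, e.g. supported / refuted / not-enough-info

rng = np.random.default_rng(0)

def encode_text(claim: str) -> np.ndarray:
    """Stand-in for a CLIP text encoder: maps a claim to a fixed-size vector.

    Uses a hash-seeded pseudo-embedding so the sketch runs without model weights.
    """
    seed = abs(hash(claim)) % (2**32)
    return np.random.default_rng(seed).standard_normal(EMB_DIM)

def encode_image(image_pixels: np.ndarray) -> np.ndarray:
    """Stand-in for a CLIP image encoder."""
    return np.resize(image_pixels.ravel().astype(float), EMB_DIM)

class VeracityClassifier:
    """Linear head over the concatenated [text; image] embedding."""

    def __init__(self) -> None:
        self.W = rng.standard_normal((2 * EMB_DIM, NUM_CLASSES)) * 0.01
        self.b = np.zeros(NUM_CLASSES)

    def predict(self, claim: str, image_pixels: np.ndarray) -> int:
        z = np.concatenate([encode_text(claim), encode_image(image_pixels)])
        logits = z @ self.W + self.b
        return int(np.argmax(logits))

clf = VeracityClassifier()
fake_image = np.zeros((8, 8, 3))
label = clf.predict("The moon is made of cheese.", fake_image)
print(label)  # an integer class index in [0, NUM_CLASSES)
```

In a real system the two encoders would share CLIP's pretrained weights and the head would be trained on the dataset mixtures discussed above; only the classification head (and optionally the encoders) would receive gradient updates.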
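The explanation-augmentation step can likewise be sketched: each training example's input is extended with an explanation produced by an LLM before the verifier is trained on it. This is a minimal illustration under assumptions: `explain_fn` is a hypothetical stand-in for a call to a model such as GPT-4o, and the field names and input template are invented for the sketch.

```python
def build_training_example(claim: str, evidence: str, label: str, explain_fn):
    """Augment a fact-checking example with an LLM-generated explanation.

    `explain_fn` stands in for querying an LLM (e.g. GPT-4o); the paper
    reports that training on such explanations improved Mocheg F1.
    """
    explanation = explain_fn(claim, evidence, label)
    return {
        "input": f"claim: {claim} evidence: {evidence} explanation: {explanation}",
        "label": label,
    }

# Toy stand-in for the LLM call; a real pipeline would issue an API request.
def toy_explain(claim: str, evidence: str, label: str) -> str:
    verb = "supports" if label == "supported" else "contradicts"
    return f"The evidence {verb} the claim."

example = build_training_example(
    "Glaciers are shrinking worldwide.",
    "Satellite records show net glacier mass loss since 1970.",
    "supported",
    toy_explain,
)
print(example["label"])  # supported
```

Swapping `toy_explain` for a real LLM call is the only change needed to reproduce the augmentation pattern; the verifier then trains on the enriched `input` strings instead of the bare claim-evidence pairs.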
Numerical Results
The models exhibit significant improvements:
- The best intra-domain mixture for Fakeddit achieved a performance of 93.42 F1, up by 1.7% F1.
- Knowledge distillation through GPT-4o enhanced Mocheg performance by 2.9% F1, achieving a state-of-the-art result of 65.07 F1 for an open model.
- Beyond fact-checking, knowledge transfer from fact-checking data to hate speech detection improved performance by 13.65%.
Implications and Future Work
The practical and theoretical implications of this research are multifaceted:
- Practical: The integration of LLM-generated explanations into the training pipeline can substantially reduce the burden on human fact-checkers and enhance the reliability of automated systems. This is particularly crucial in rapidly evolving domains like political news and global events.
- Theoretical: The study underscores the synergy between diverse datasets in improving generalization and robustness in fact verification. It also presents a compelling case for the continuous integration of commonsense and domain-specific knowledge through LLMs to counter misinformation.
Future Directions
The high sensitivity to domain shift and the need for robust evaluation on unseen data are notable challenges. Future research could focus on:
- Expansion of dataset diversity to include more topical and temporally varied data, enhancing model robustness to covariate shifts.
- Integration of visual information in the generation of explanations to further improve the multimodal reasoning capabilities of verification models.
- Exploration of more sophisticated ensemble methods for combining intra- and inter-domain data.
Conclusion
The paper's comprehensive analysis and innovative approaches provide valuable insights for developing more resilient and accurate fact verification systems. By leveraging the capabilities of state-of-the-art LLMs and emphasizing knowledge transfer, the authors contribute significantly to the advancement of content moderation technology. This research marks a step towards more reliable and scalable automated tools for combating misinformation, calling for continued exploration of transfer learning and explanation-driven model training.