
EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification

Published Oct 15, 2023 in cs.AI


Fact verification aims to automatically assess the veracity of a claim based on several pieces of evidence. Existing work focuses largely on improving accuracy and pays little attention to explainability, a critical capability of fact verification systems. Constructing an explainable fact verification system in a complex multi-hop scenario is consistently impeded by the absence of a relevant, high-quality dataset. Previous datasets either suffer from excessive simplification or fail to incorporate essential considerations for explainability. To address this, we present EX-FEVER, a pioneering dataset for multi-hop explainable fact verification. It contains over 60,000 claims involving 2-hop and 3-hop reasoning, each created by summarizing and modifying information from hyperlinked Wikipedia documents. Each instance is accompanied by a veracity label and an explanation that outlines the reasoning path supporting the veracity classification. Additionally, we demonstrate a novel baseline system on our EX-FEVER dataset, showcasing document retrieval, explanation generation, and claim verification, and validate the significance of our dataset. Furthermore, we highlight the potential of utilizing LLMs in the fact verification task. We hope our dataset will make a significant contribution by providing ample opportunities to explore the integration of natural language explanations in the domain of fact verification.

Sample from EX-FEVER dataset showing textual explanation color-coded to correspond with different documents.


  • The EX-FEVER dataset introduces over 60,000 claims requiring 2-hop and 3-hop reasoning for multi-hop explainable fact verification, complete with veracity labels and explanations.

  • It underscores the importance of high-quality, varied examples for complex fact verification and uses a novel baseline system to demonstrate LLMs' potential in the field.

  • Challenges in document retrieval and integrating explanations into the verification process are highlighted, alongside the limitations of current fact-checking models.

  • The dataset encourages future research by providing a foundation that challenges current methodologies and demonstrates LLMs' roles in augmenting human fact-checking efforts.

EX-FEVER: Pioneering the Way in Multi-hop Explainable Fact Verification


With the proliferation of digital information, the necessity for reliable fact verification systems has become increasingly evident. The EX-FEVER dataset emerges as a response to the critical need for high-quality data to facilitate research in multi-hop explainable fact verification. This dataset introduces over 60,000 claims necessitating 2-hop and 3-hop reasoning, each with a designated veracity label and an explanation delineating the reasoning path. Through the development of a novel baseline system and the demonstration of LLMs' potential in fact verification, EX-FEVER sets the stage for significant advancements in the domain.

Dataset Overview

EX-FEVER differentiates itself by focusing on multi-hop reasoning with a strong emphasis on explainability. The dataset includes claims generated by summarizing and modifying information from hyperlinked Wikipedia documents, each accompanied by a veracity label (SUPPORTS, REFUTES, NOT ENOUGH INFO) and a detailed explanation. These explanations are pivotal, providing insights into the reasoning behind the veracity classification. The meticulous construction of EX-FEVER involved crowd workers, ensuring high-quality and varied examples that mirror the complexity and nuances of real-world data.
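To make the structure of an instance concrete, the sketch below models one record as a small Python data class. The field names (`claim`, `label`, `explanation`, `evidence_docs`) and the rule that the hop count equals the number of evidence documents are assumptions made for illustration; the released dataset may use a different schema. Only the three label values come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Label(Enum):
    """The three veracity labels named in the dataset description."""
    SUPPORTS = "SUPPORTS"
    REFUTES = "REFUTES"
    NOT_ENOUGH_INFO = "NOT ENOUGH INFO"


@dataclass
class Instance:
    """Hypothetical record layout; field names are illustrative assumptions."""
    claim: str
    label: Label
    explanation: str          # natural-language reasoning path
    evidence_docs: List[str]  # titles of the hyperlinked Wikipedia documents

    @property
    def num_hops(self) -> int:
        # Assumption: one hop per evidence document.
        return len(self.evidence_docs)

    def validate(self) -> None:
        """Raise ValueError if the record violates the assumed constraints."""
        if self.num_hops not in (2, 3):
            raise ValueError("expected 2-hop or 3-hop evidence")
        if not self.explanation.strip():
            raise ValueError("explanation must not be empty")


# Placeholder content only; this is not an actual dataset entry.
example = Instance(
    claim="<claim text>",
    label=Label("NOT ENOUGH INFO"),
    explanation="<explanation text>",
    evidence_docs=["<document A>", "<document B>"],
)
example.validate()
```

Keeping the label as an enum rather than a free string means a misspelled label fails at load time instead of silently skewing evaluation.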

Baseline System Evaluation

The baseline system, composed of document retrieval, explanation generation, and claim verification stages, serves as a testament to the robustness and applicability of the EX-FEVER dataset. The performance of the system underscores the challenges in multi-hop fact verification, especially in document retrieval and the integration of explanations into the verification process.

Notably, the evaluation revealed a bottleneck in document retrieval, which points to the need for retrieval models designed specifically for multi-hop settings. Furthermore, the analysis of verdict prediction highlighted the limitations of existing fact-checking models and the need for more sophisticated approaches that can handle the intricacies of multi-hop reasoning.
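The three stages described above can be sketched as a simple pipeline. This is a toy illustration, not the paper's baseline: the retrieval step here is a stand-in that picks the best lexical match and then follows hyperlinks, and the `explain` and `verify` callables are hypothetical placeholders for whatever models are actually used. It does show why retrieval is a bottleneck: every later stage only sees the documents retrieval returns.

```python
from typing import Callable, Dict, List, Set, Tuple

# Assumed corpus layout: title -> (document text, hyperlinked titles).
Corpus = Dict[str, Tuple[str, List[str]]]


def _tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def retrieve(claim: str, corpus: Corpus, hops: int = 2) -> List[str]:
    """Toy multi-hop retrieval: best lexical match, then follow hyperlinks."""
    claim_tokens = _tokens(claim)

    def score(title: str) -> int:
        return len(claim_tokens & _tokens(corpus[title][0]))

    # sorted() makes tie-breaking deterministic (alphabetical by title).
    current = max(sorted(corpus), key=score)
    path = [current]
    for _ in range(hops - 1):
        links = [t for t in corpus[current][1] if t in corpus and t not in path]
        if not links:
            break  # no unvisited hyperlink to follow
        current = max(sorted(links), key=score)
        path.append(current)
    return path


def run_pipeline(
    claim: str,
    corpus: Corpus,
    explain: Callable[[str, List[str]], str],
    verify: Callable[[str, str], str],
    hops: int = 2,
) -> Tuple[List[str], str, str]:
    """Retrieval -> explanation generation -> claim verification."""
    docs = retrieve(claim, corpus, hops)
    explanation = explain(claim, [corpus[t][0] for t in docs])
    label = verify(claim, explanation)
    return docs, explanation, label
```

In this arrangement the verifier receives the generated explanation rather than the raw documents, which is one possible way to integrate explanations into verification; other designs could pass both.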

LLMs in Fact Verification

A compelling aspect of the paper is its exploration of LLMs for fact verification. The investigation finds that LLMs are more effective as planners that generate explanations than as models that predict the verdict directly. This finding suggests a direction for future fact verification systems, in which LLMs augment human fact-checkers by producing program guides for the verification process, improving both the efficiency and the reliability of fact-checking.

Final Thoughts

EX-FEVER is a significant step toward multi-hop explainable fact verification systems. By offering a large dataset that challenges current methods and highlights the potential of LLMs, the work opens the way for further research. It invites a reevaluation of existing approaches and encourages new solutions for the complexities of multi-hop reasoning and explainability in fact verification. As the landscape of digital information continues to evolve, EX-FEVER is likely to influence fact-checking research, steering it towards more accountable, transparent, and reliable systems.
