WiCE: Real-World Entailment for Claims in Wikipedia

(2303.01432)
Published Mar 2, 2023 in cs.CL

Abstract

Textual entailment models are increasingly applied in settings like fact-checking, presupposition verification in question answering, or summary evaluation. However, these represent a significant domain shift from existing entailment datasets, and models underperform as a result. We propose WiCE, a new fine-grained textual entailment dataset built on natural claim and evidence pairs extracted from Wikipedia. In addition to standard claim-level entailment, WiCE provides entailment judgments over sub-sentence units of the claim, and a minimal subset of evidence sentences that support each subclaim. To support this, we propose an automatic claim decomposition strategy using GPT-3.5 which we show is also effective at improving entailment models' performance on multiple datasets at test time. Finally, we show that real claims in our dataset involve challenging verification and retrieval problems that existing models fail to address.

WiCE shows Wikipedia claim annotations, breaking them into supported, partially supported, and unsupported subclaims.

Overview

  • Introduces the WiCE dataset for realistic textual entailment tasks in NLP, using Wikipedia as a source.

  • Offers annotations at the sub-sentence level indicating which parts of claims are supported by evidence.

  • WiCE includes an automatic claim decomposition tool called Claim-Split, powered by GPT-3.5, for better annotation and model performance.

  • Highlights that current entailment models struggle with complex real-world claims found in WiCE.

  • Demonstrates that combining retrieved evidence chunks with surrounding context improves verification, though models still lag behind human-level accuracy.

The field of NLP often requires models to verify the truthfulness of statements based on provided evidence, which can have applications ranging from fact-checking to document summarization. A new dataset, named WiCE (Wikipedia Citation Entailment), intends to tackle these challenges by offering a more realistic and fine-grained textual entailment setup.

This dataset is rooted in Wikipedia, where claims within articles are automatically identified and linked with the articles they cite as evidence. WiCE not only assesses whether a claim is supported, partially supported, or unsupported by the evidence but also provides detailed annotations for sub-sentence units within the claims, showing exactly which parts are supported by the evidence and which are not.
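The annotation structure described above can be pictured with a small data model. This is an illustrative sketch only; the class and field names are assumptions for exposition, not the dataset's official schema.

```python
from dataclasses import dataclass

@dataclass
class SubClaim:
    text: str
    label: str                    # "supported" | "partially_supported" | "not_supported"
    supporting_sentences: list    # indices of the minimal evidence sentences (may be empty)

@dataclass
class WiceExample:
    claim: str        # the sentence extracted from a Wikipedia article
    evidence: list    # sentences from the cited source article
    subclaims: list   # fine-grained sub-sentence judgments

    def claim_label(self) -> str:
        """Aggregate sub-claim labels into a single claim-level judgment."""
        labels = {s.label for s in self.subclaims}
        if labels == {"supported"}:
            return "supported"
        if labels == {"not_supported"}:
            return "not_supported"
        return "partially_supported"
```

The aggregation rule shown here (all-supported, none-supported, otherwise partial) captures why sub-sentence labels are strictly more informative than a single claim-level tag.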

One notable innovation introduced alongside WiCE is an automatic claim decomposition strategy known as Claim-Split. Utilizing GPT-3.5, it breaks complex claims into more manageable subclaims, making the annotation process more efficient and possibly improving the performance of entailment models, as subclaims can be easier to evaluate than longer, more intricate statements.
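A Claim-Split style decomposition is typically driven by a few-shot prompt to the LLM. The sketch below shows the general shape; the wording and the single in-context example are illustrative, not the paper's actual prompt.

```python
# One hypothetical in-context example; the real Claim-Split prompt in the
# paper may use different instructions and more demonstrations.
FEW_SHOT = (
    "Segment the following sentence into individual facts:\n"
    "Sentence: He was born in London and studied physics at Cambridge.\n"
    "Facts:\n"
    "- He was born in London.\n"
    "- He studied physics at Cambridge.\n\n"
)

def claim_split_prompt(claim: str) -> str:
    """Build a few-shot decomposition prompt for a claim.

    The returned string would be sent to an LLM (e.g. GPT-3.5), whose
    completion is then parsed line-by-line into subclaims.
    """
    return (
        FEW_SHOT
        + "Segment the following sentence into individual facts:\n"
        + f"Sentence: {claim}\n"
        + "Facts:\n"
    )
```

Each returned subclaim can then be verified independently against the evidence, which is what makes decomposition useful both for annotators and for entailment models at test time.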

WiCE is shown to pose new challenges for current entailment models that generally deal with shorter texts. Existing models, when assessed on real-world claims from the dataset, underperform due to the complex nature of evidence verification and retrieval issues that these models are not yet equipped to handle.

The importance of context and retrieval is underscored in the data analysis. Models trained to predict entailment using chunks of the evidence, combined with context, achieve better performance than those relying solely on individual sentences. However, these systems still fall short of human-level performance.
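The retrieve-then-verify setup described above can be sketched as follows. The token-overlap retriever and the pluggable scoring function are stand-ins for a real retriever and a trained entailment model; both are assumptions for illustration.

```python
def retrieve_top_k(claim: str, evidence: list, k: int = 3) -> list:
    """Rank evidence sentences by lexical overlap with the claim.

    A real system would use a learned retriever; word overlap is a
    deliberately simple stand-in.
    """
    claim_tokens = set(claim.lower().split())
    return sorted(
        evidence,
        key=lambda s: len(claim_tokens & set(s.lower().split())),
        reverse=True,
    )[:k]

def verify(claim: str, evidence: list, entail_fn, k: int = 3) -> float:
    """Score a claim against a chunk of retrieved evidence.

    `entail_fn(premise=..., hypothesis=...)` is a hypothetical hook for
    any entailment model returning a support probability.
    """
    chunk = " ".join(retrieve_top_k(claim, evidence, k))
    return entail_fn(premise=chunk, hypothesis=claim)
```

Feeding the model a multi-sentence chunk rather than isolated sentences mirrors the finding that context around the retrieved evidence matters for accurate verification.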

In summary, WiCE represents a step forward in the realistic assessment of models' capability to determine the factual correctness of real-world claims. Its supporting tools, like Claim-Split and fine-grained annotations, provide ways to both enhance the dataset and potentially improve model performance, emphasizing the importance of context, retrieval, and the granularity of evidence in the continuous evolution of automated fact verification systems.
