Multi-hop Reading Comprehension through Question Decomposition and Rescoring (1906.02916v2)

Published 7 Jun 2019 in cs.CL and cs.AI

Abstract: Multi-hop Reading Comprehension (RC) requires reasoning and aggregation across several paragraphs. We propose a system for multi-hop RC that decomposes a compositional question into simpler sub-questions that can be answered by off-the-shelf single-hop RC models. Since annotations for such decomposition are expensive, we recast sub-question generation as a span prediction problem and show that our method, trained using only 400 labeled examples, generates sub-questions that are as effective as human-authored sub-questions. We also introduce a new global rescoring approach that considers each decomposition (i.e. the sub-questions and their answers) to select the best final answer, greatly improving overall performance. Our experiments on HotpotQA show that this approach achieves the state-of-the-art results, while providing explainable evidence for its decision making in the form of sub-questions.

Citations (219)


Summary

The paper introduces a novel approach that decomposes multi-hop questions into single-hop sub-questions via span prediction with minimal supervision.
It employs a global rescoring mechanism that evaluates answer paths, outperforming standard models like BERT on benchmarks such as HotpotQA.
DecompRC enhances explainability and robustness by providing transparent reasoning paths and resilient performance in adversarial settings.

Multi-hop Reading Comprehension through Question Decomposition and Rescoring

The paper presents DecompRC, a system for multi-hop reading comprehension (RC) that decomposes a compositional question into simpler, single-hop sub-questions, each of which can be answered by an off-the-shelf single-hop RC model. Multi-hop RC is difficult because the answer requires reasoning over and aggregating evidence from several paragraphs, whereas a single-hop question can typically be answered from one paragraph.
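To make the idea concrete, here is a minimal Python sketch of a bridging-style decomposition, assuming a span predictor has already selected a token range in the question. The function names, the `[ANSWER]` placeholder, the example question, and the lookup-table "single-hop model" are all illustrative assumptions rather than the paper's implementation; a real system would need further handling to turn the selected span into a well-formed question.

```python
from typing import Callable, List, Tuple

PLACEHOLDER = "[ANSWER]"  # hypothetical marker for the first hop's answer


def decompose_bridging(tokens: List[str], start: int, end: int) -> Tuple[str, str]:
    """Split a question into two sub-questions around a span (inclusive indices).

    The span itself becomes the first sub-question; the original question with
    the span replaced by a placeholder becomes the second.
    """
    sub_q1 = " ".join(tokens[start:end + 1])
    sub_q2 = " ".join(tokens[:start] + [PLACEHOLDER] + tokens[end + 1:])
    return sub_q1, sub_q2


def answer_bridging(
    tokens: List[str],
    start: int,
    end: int,
    single_hop: Callable[[str], str],
) -> str:
    """Answer the first hop, substitute its answer, then answer the second hop."""
    sub_q1, sub_q2 = decompose_bridging(tokens, start, end)
    first_answer = single_hop(sub_q1)
    return single_hop(sub_q2.replace(PLACEHOLDER, first_answer))


# Toy stand-in for an off-the-shelf single-hop RC model (made-up data).
TOY_ANSWERS = {
    "the player named MVP of the 2015 tournament": "Player X",
    "Which team does Player X play for ?": "Team Y",
}


def toy_single_hop(question: str) -> str:
    return TOY_ANSWERS.get(question, "")


tokens = "Which team does the player named MVP of the 2015 tournament play for ?".split()
print(decompose_bridging(tokens, 3, 10))
print(answer_bridging(tokens, 3, 10, toy_single_hop))  # prints "Team Y"
```

Because the sub-questions are built only from spans of the original question, the decomposer needs to predict a few indices rather than generate free text, which is one plausible reason a small labeled set can suffice.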

Key Contributions

Question Decomposition as Span Prediction: DecompRC recasts sub-question generation as a span prediction task over the original question, trained using only 400 labeled examples. The paper reports that the resulting sub-questions are as effective as human-authored ones. This avoids the need for expensive large-scale decomposition annotations and relies on BERT-based encodings for the span prediction.
Global Rescoring Mechanism: The paper introduces a global rescoring approach that considers each decomposition (the sub-questions together with their answers) and selects the best final answer. Deferring this choice until every candidate decomposition has been answered mitigates the error propagation typical of pipeline models.
Experiments and Results: The paper reports state-of-the-art results on the HotpotQA benchmark at the time of publication. The experiments also indicate that DecompRC is robust in settings with distractor paragraphs and adversarially inverted questions. The model holds up under shifts in data distribution, outperforming end-to-end models such as BERT on multi-hop questions where single-hop shortcuts are inadequate.
Explainability and Robustness: DecompRC also improves explainability. The decomposed sub-questions expose the reasoning path that led to the final answer, which serves as evidence for the model's decision. The robustness results, obtained in new adversarial and modified-distractor settings, support its practical applicability and may inform future work on adversarial robustness in RC systems.
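The global rescoring step described above can be sketched as follows. This is a simplified illustration under stated assumptions: the `Candidate` fields, the `rescore` function, and especially `toy_scorer` are hypothetical. The paper describes a learned rescoring approach, whereas the scorer here is a hand-written stand-in that only counts answered sub-questions, and all data is made up.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """One decomposition of the question, with its answers (hypothetical layout)."""
    reasoning_type: str               # e.g. "bridging", "intersection", "comparison"
    sub_questions: Tuple[str, ...]
    sub_answers: Tuple[str, ...]
    final_answer: str


def rescore(
    candidates: List[Candidate],
    scorer: Callable[[Candidate], float],
) -> Candidate:
    """Return the candidate with the highest score across all decompositions."""
    if not candidates:
        raise ValueError("at least one candidate decomposition is required")
    return max(candidates, key=scorer)


def toy_scorer(candidate: Candidate) -> float:
    """Stand-in for a learned scorer: fraction of sub-questions with an answer."""
    answered = sum(1 for a in candidate.sub_answers if a)
    return answered / len(candidate.sub_questions)


candidates = [
    Candidate("comparison", ("sub-question A", "sub-question B"), ("", ""), ""),
    Candidate("intersection", ("sub-question C", "sub-question D"), ("Player X", ""), "Player X"),
    Candidate("bridging", ("sub-question E", "sub-question F"), ("Player X", "Team Y"), "Team Y"),
]

best = rescore(candidates, toy_scorer)
print(best.reasoning_type, best.final_answer)  # prints "bridging Team Y"
```

The point of the design is that no decomposition is committed to up front: every reasoning type is answered first, and the choice is made only once the sub-questions and their answers are available to the scorer.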

Implications and Future Directions

DecompRC has implications for both academic research and applied NLP systems. Its ability to learn effective decompositions from minimal supervision, together with the resulting performance gains, suggests a route to multi-hop RC where annotation budgets are limited. The explainability provided by the sub-questions may also improve trust and usability in real-world applications.

Areas ripe for future exploration include extending the decomposition framework to accommodate more complex reasoning types beyond those currently addressed (bridging, intersection, and comparison), and further refining the rescoring mechanism to improve upon identified limitations. Additionally, integrating commonsense reasoning capabilities could bolster the system's capacity to handle implicit multi-hop questions where explicit answers are not readily available.

In summary, DecompRC advances multi-hop reading comprehension by combining span-based question decomposition with global rescoring, and its sub-questions give a more transparent account of how each answer was reached.
