
Graph-Guided Reasoning for Multi-Hop Question Answering in Large Language Models

(2311.09762)
Published Nov 16, 2023 in cs.CL, cs.AI, and cs.LG

Abstract

Chain-of-Thought (CoT) prompting has boosted the multi-step reasoning capabilities of LLMs by generating a series of rationales before the final answer. We analyze the reasoning paths generated by CoT and find two issues in multi-step reasoning: (i) generating rationales irrelevant to the question, and (ii) failing to compose the subquestions or queries needed to generate or retrieve all the relevant information. To address them, we propose a graph-guided CoT prompting method, which guides the LLMs to reach the correct answer through graph representation and verification steps. Specifically, we first leverage LLMs to construct a "question/rationale graph" using knowledge-extraction prompting on the initial question and the rationales generated in previous steps. The graph verification step then diagnoses the current rationale triplet by comparing it against the existing question/rationale graph, filtering out irrelevant rationales and generating follow-up questions to obtain relevant information. Additionally, we generate CoT paths that exclude the extracted graph information, to capture context information missed during graph extraction. Our graph-guided reasoning method shows superior performance compared to previous CoT prompting and its variants on multi-hop question answering benchmark datasets.

The graph-guided prompting method comprises question graph extraction, intermediate question generation, answer generation, and rationale verification.

Overview

  • The paper 'Graph-Guided Reasoning for Multi-Hop Question Answering in LLMs' introduces a method to improve the reasoning capabilities of LLMs by addressing deficiencies in Chain-of-Thought (CoT) prompting approaches.

  • The proposed method involves constructing a question graph, generating and verifying subquestions and their rationales, and creating contextual CoT paths to ensure relevant information is used in reasoning.

  • Evaluated on benchmark datasets, the graph-guided reasoning method consistently outperforms existing CoT approaches, highlighting its potential for both practical applications and theoretical advancements in AI.

Graph-Guided Reasoning for Multi-Hop Question Answering in LLMs

The paper "Graph-Guided Reasoning for Multi-Hop Question Answering in LLMs," authored by Jinyoung Park, Ameen Patel, Omar Zia Khan, Hyunwoo J. Kim, and Joo-Kyung Kim, presents a methodical approach to enhancing the reasoning capabilities of LLMs in multi-hop question answering (QA). The authors identify and address the deficiencies of existing Chain-of-Thought (CoT) prompting approaches, which include generating irrelevant rationales and failing to compose necessary subquestions for retrieving pertinent information.

Introduction

LLMs have demonstrated significant proficiency across varied natural language processing tasks as model sizes have scaled up. Nonetheless, complex reasoning tasks, such as arithmetic, commonsense, and multi-hop QA, continue to pose challenges. Traditional CoT prompting methods have improved reasoning by generating intermediate rationales but still struggle with irrelevant rationale generation and hallucination.

Motivation

The paper identifies two critical problems with existing CoT approaches:

  1. Generation of rationales that are irrelevant to the posed question.
  2. Inability to compose the subquestions or queries needed to gather all relevant information.

These limitations impede the model's ability to accurately reason through multiple steps required in multi-hop QA tasks.

Proposed Method

To mitigate these issues, the authors propose a graph-guided CoT prompting method. The key steps in their approach are as follows (a minimal code sketch follows the list):

  1. Question Graph Construction: Using LLM prompting, a question graph is constructed by extracting triplets from the initial question. This graph represents relationships and serves as a foundation for guided reasoning.
  2. Subquestion Generation: Based on the question graph, multiple subquestions are generated. These subquestions help in decomposing the original complex question into simpler, more manageable parts.
  3. Rationale Generation: For each subquestion, the model generates intermediate rationales. This process ensures that each step of reasoning is backed by relevant information.
  4. Rationale Verification: Generated rationales are compared against the question graph. If a rationale is deemed irrelevant, it is filtered out. Moreover, follow-up questions are posed to gather any missing relevant information.
  5. Contextual CoT Paths: Conventional CoT paths are generated excluding the entities mentioned in the question graph to capture context information potentially missed during graph extraction.
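
To make these steps concrete, here is a minimal Python sketch of the graph-guided loop. It is an illustration under several assumptions, not the authors' code: `llm` stands in for any prompted model call (the paper uses Llama-2), the prompt strings are placeholders rather than the paper's templates, the entity-overlap check in `verify_rationale` is a simplified stand-in for the paper's LLM-based graph verification, and `max_hops` is an invented budget parameter.

```python
from typing import Callable, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)

# Stand-in for any chat-completion call mapping a prompt string to a
# completion string (the paper uses prompted Llama-2 models).
LLM = Callable[[str], str]


def extract_triplets(llm: LLM, text: str) -> Set[Triplet]:
    """Knowledge-extraction prompting: ask the model to emit one
    'subject | relation | object' triplet per line, then parse them.
    The prompt wording here is illustrative, not the paper's template."""
    prompt = (
        "Extract knowledge triplets from the text below.\n"
        "Write one per line as: subject | relation | object\n\n"
        f"Text: {text}\nTriplets:"
    )
    triplets: Set[Triplet] = set()
    for line in llm(prompt).splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            triplets.add((parts[0], parts[1], parts[2]))
    return triplets


def verify_rationale(graph: Set[Triplet], candidates: Set[Triplet]):
    """Graph verification, simplified to an entity-overlap heuristic: a
    candidate rationale triplet counts as relevant if it shares an entity
    with the existing question/rationale graph; the rest trigger follow-up
    questions. The paper performs this comparison via LLM prompting."""
    entities = {e for (s, _, o) in graph for e in (s, o)}
    relevant = {t for t in candidates if t[0] in entities or t[2] in entities}
    return relevant, candidates - relevant


def graph_guided_answer(llm: LLM, question: str, max_hops: int = 4) -> str:
    """The overall loop: build the question graph, then alternate rationale
    generation and graph verification, composing follow-up questions when
    a rationale introduces unverified information."""
    graph = extract_triplets(llm, question)
    context = question
    for _ in range(max_hops):
        rationale = llm(f"{context}\nNext reasoning step:")
        relevant, unverified = verify_rationale(
            graph, extract_triplets(llm, rationale)
        )
        graph |= relevant  # grow the graph with verified facts
        if unverified:     # ask for the missing relevant information
            topics = "; ".join(" ".join(t) for t in unverified)
            context += "\n" + llm(f"Ask a follow-up question about: {topics}")
        else:
            context += "\n" + rationale
    return llm(f"{context}\nTherefore, the final answer is:")
```

In a real setting, `llm` would wrap a Llama-2 endpoint with few-shot exemplars, and the contextual CoT paths of step 5 would be generated alongside this loop and merged at answer time; both are omitted here for brevity.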

Results and Evaluation

The authors evaluate their method on three multi-hop QA benchmark datasets: 2WikiMultihopQA, MuSiQue, and Bamboogle. They conduct experiments using Llama-2 models of varying sizes (13B and 70B). The proposed graph-guided reasoning approach consistently outperforms existing CoT prompting methods across all datasets and model sizes.

Numerical Performance

  • For 2WikiMultihopQA, the graph-guided reasoning method achieves 39.2% EM (Exact Match) and 46.87% F1, compared to 37.6% EM and 44.04% F1 for the strongest baseline (Self-Consistency) with Llama-2-70B (see the scoring sketch after this list).
  • In the open-book setting, the proposed method scores an impressive 54.2% EM and 63.97% F1 on 2WikiMultihopQA.
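
For context, EM and F1 here are the standard token-level QA metrics. A minimal sketch of how they are typically computed follows; it assumes SQuAD-style answer normalization, which may differ in detail from the paper's exact evaluation script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """EM: 1.0 iff the normalized strings are identical."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```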

Implications

The introduction of graph-guided CoT prompting addresses key limitations of traditional methods, notably through its structured approach to generating and verifying rationales. The implications of this research are significant for both practical applications and theoretical advancements in AI:

  • Practical: Enhanced performance in multi-hop QA tasks can improve AI applications requiring complex decision-making and reasoning, such as customer service automation and advanced tutoring systems.
  • Theoretical: The integration of graph structures in CoT prompting paves the way for more sophisticated hybrid models combining symbolic reasoning with deep learning.

Future Directions

Future work could explore further refinement of graph extraction techniques and better integration with retrieval-augmented generation methods. Additionally, expanding the approach to other types of questions and reasoning tasks could demonstrate the broader applicability of the method.

In summary, this paper introduces a systematic and effective approach to enhancing LLMs' reasoning capabilities in multi-hop QA tasks by leveraging graph-based knowledge representation and verification, setting a new benchmark for future research in this domain.
