
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning (2305.12295v2)

Published 20 May 2023 in cs.CL and cs.AI

Abstract: LLMs have shown human-like reasoning abilities but still struggle with complex logical problems. This paper introduces a novel framework, Logic-LM, which integrates LLMs with symbolic solvers to improve logical problem-solving. Our method first utilizes LLMs to translate a natural language problem into a symbolic formulation. Afterward, a deterministic symbolic solver performs inference on the formulated problem. We also introduce a self-refinement module, which utilizes the symbolic solver's error messages to revise symbolic formalizations. We demonstrate Logic-LM's effectiveness on five logical reasoning datasets: ProofWriter, PrOntoQA, FOLIO, LogicalDeduction, and AR-LSAT. On average, Logic-LM achieves a significant performance boost of 39.2% over using LLM alone with standard prompting and 18.4% over LLM with chain-of-thought prompting. Our findings suggest that Logic-LM, by combining LLMs with symbolic logic, offers a promising avenue for faithful logical reasoning. Code and data are publicly available at https://github.com/teacherpeterpan/Logic-LLM.

Citations (171)

Summary

  • The paper introduces Logic-LM, a framework that integrates LLMs with symbolic solvers to enhance logical reasoning.
  • It employs a three-module design for problem formulation, symbolic reasoning, and result interpretation, achieving an average improvement of 39.2% over standard prompting.
  • The study demonstrates that self-refinement based on solver feedback increases symbolic accuracy and overall model robustness in diverse logical reasoning scenarios.

Logic-LM: Integrating LLMs with Symbolic Solvers

The paper proposes Logic-LM, a framework designed to enhance the logical reasoning capabilities of LLMs by integrating them with symbolic solvers. The approach addresses LLMs' struggles with complex logical problems by delegating inference to deterministic symbolic engines, making the reasoning faithful and transparent. The framework also includes a self-refinement module that uses the symbolic solver's error messages to iteratively correct the symbolic formalizations.

Framework Overview

Logic-LM consists of three key modules: the Problem Formulator, the Symbolic Reasoner, and the Result Interpreter.

  1. Problem Formulator: This module uses an LLM to translate a natural language problem into a symbolic representation. By defining a task-specific grammar for logic programming (LP), first-order logic (FOL), constraint satisfaction problem (CSP), and boolean satisfiability (SAT) formulations, the Problem Formulator produces symbolic statements that serve as inputs for the symbolic solvers. (Figure 1: Overview of our Logic-LM framework.)
  2. Symbolic Reasoner: This component uses external deterministic solvers tailored to specific reasoning tasks—LP systems for deductive reasoning, Prover9 for FOL, python-constraint for CSP, and Z3 for SAT—to infer solutions or prove propositions from the given symbolic input; a brief illustrative sketch follows this list.
  3. Result Interpreter: This module translates solver results back into natural language answers. Depending on the problem's complexity, it employs either rule-based or LLM-based methods to perform this translation.

    Figure 2: Overview of our Logic-LM model, which consists of three modules: (1) Problem Formulator generates a symbolic representation for the input problem with LLMs via in-context learning, (2) Symbolic Reasoner performs logical inference on the formulated problem, and (3) Result Interpreter interprets the symbolic answer.
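
To make the Symbolic Reasoner concrete, the following is a minimal sketch of how a LogicalDeduction-style ordering puzzle could be handed to python-constraint, one of the solvers named above. The puzzle, variable names, and constraints here are illustrative assumptions, not an example taken from the paper:

```python
from constraint import Problem, AllDifferentConstraint

# Illustrative puzzle: three books (red, green, blue) occupy shelf
# positions 1-3; the red book is left of the green book, and the
# blue book is not in the middle.
problem = Problem()
problem.addVariables(["red", "green", "blue"], [1, 2, 3])
problem.addConstraint(AllDifferentConstraint())              # one book per position
problem.addConstraint(lambda r, g: r < g, ("red", "green"))  # red left of green
problem.addConstraint(lambda b: b != 2, ("blue",))           # blue not in the middle

for solution in problem.getSolutions():
    print(solution)  # e.g., {'red': 1, 'green': 2, 'blue': 3}
```

The Result Interpreter would then map a returned assignment back onto the answer options of the original multiple-choice question.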

Experimental Setup

The paper evaluates Logic-LM's effectiveness across five logical reasoning datasets: ProofWriter, PrOntoQA, FOLIO, LogicalDeduction, and AR-LSAT. These datasets cover a range of logical reasoning tasks, including deductive reasoning, first-order logic reasoning, constraint satisfaction, and analytical reasoning. Logic-LM's performance is compared against standard prompting and chain-of-thought prompting, using ChatGPT and GPT-4 as the underlying LLMs.

Results and Observations

  1. Performance Enhancement: Logic-LM significantly enhances performance over standard LLM prompting and chain-of-thought prompting. On average, it achieves a 39.2% improvement over standard prompting and an 18.4% improvement over chain-of-thought prompting, highlighting the robustness of integrating symbolic solvers.
  2. Reasoning Depth and Robustness: The effectiveness of Logic-LM grows with the complexity of the reasoning task, particularly as the required reasoning depth increases. Its ability to delegate complex inference to symbolic solvers yields a faithfulness and robustness that purely language-based methods lack, which is especially evident on more complex datasets such as FOLIO (Figure 3); the sketch following the figure caption below illustrates why solver-based inference is insensitive to reasoning depth.

    Figure 3: Accuracy of different models for increasing size of reasoning depth on the ProofWriter dataset.
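
The paper routes deductive datasets like ProofWriter to a logic-programming solver; purely to illustrate why deterministic solvers handle deep inference chains exactly, here is a minimal sketch using the Z3 Python bindings (the solver the paper uses for SAT problems). The chain-of-implications knowledge base is an assumed toy example, not taken from the paper:

```python
from z3 import Bool, Implies, Not, Solver, unsat

# Knowledge base: fact_0 holds, and each fact_i implies fact_{i+1}.
# A depth-d query asks whether fact_d is entailed by the premises.
depth = 10
facts = [Bool(f"fact_{i}") for i in range(depth + 1)]

solver = Solver()
solver.add(facts[0])                             # base fact
for i in range(depth):
    solver.add(Implies(facts[i], facts[i + 1]))  # rule: fact_i => fact_{i+1}

# KB entails fact_depth iff KB together with its negation is unsatisfiable.
solver.add(Not(facts[depth]))
print("entailed" if solver.check() == unsat else "not entailed")  # prints: entailed
```

Because the solver performs exact inference, the answer is unaffected by the length of the chain; an LLM reasoning in natural language, by contrast, tends to degrade as more hops are required.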

  3. Self-Refinement Impact: Introducing self-refinement improves the accuracy of symbolic formulations by iteratively refining the logic representation based on solver feedback. This process increases executable rates and enhances overall model performance (Figure 4); a minimal sketch of this loop follows the figure caption below.

    Figure 4: The accuracy for different rounds of self-refinement, with the corresponding executable rates.
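
Here is a minimal sketch of the self-refinement loop described above. The helper functions (llm_formulate, llm_refine, run_solver) and the SolverError exception are hypothetical stand-ins for the paper's components, not its actual API:

```python
class SolverError(Exception):
    """Raised when a symbolic formulation fails to parse or execute (hypothetical)."""

# Hypothetical stand-ins; a real system would plug in an LLM API call
# and a wrapper around the chosen symbolic solver.
def llm_formulate(problem_text: str) -> str: ...
def llm_refine(problem_text: str, formulation: str, error: str) -> str: ...
def run_solver(formulation: str) -> str: ...

def solve_with_refinement(problem_text: str, max_rounds: int = 3):
    """Retry solving, feeding the solver's error messages back to the LLM."""
    formulation = llm_formulate(problem_text)    # initial symbolic program
    for _ in range(max_rounds):
        try:
            return run_solver(formulation)       # deterministic symbolic inference
        except SolverError as err:
            # The error message guides the LLM's revision of the formulation,
            # as in the paper's self-refinement module.
            formulation = llm_refine(problem_text, formulation, str(err))
    return None  # still not executable; fall back (e.g., to direct prompting)
```

Capping the number of rounds keeps the loop bounded when a formulation cannot be repaired.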

Case Studies

In examining specific examples, the paper showcases Logic-LM's symbolic generation capabilities as well as persistent challenges, such as accurately defining predicates and handling natural language ambiguity. These examples highlight areas for improvement and potential further refinements (Figure 5).

Figure 5: An example of the generated symbolic representation and the predicted answer by Logic-LM.

Conclusion

Logic-LM represents a promising approach to improving logical reasoning in LLMs by combining the interpretability of symbolic solvers with the robust language understanding capabilities of modern LLMs. Future advancements could include more adaptable logic systems, such as probabilistic soft logic, to address reasoning in contexts of uncertainty and commonsense challenges.
