Tree of Thoughts: Deliberate Problem Solving with Large Language Models (2305.10601v2)

Published 17 May 2023 in cs.CL, cs.AI, and cs.LG

Abstract: LLMs are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for LLM inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting LLMs, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances LLMs' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: https://github.com/princeton-nlp/tree-of-thought-LLM.

References (44)

Citations (1,229)

View on Semantic Scholar

Summary

The paper introduces a novel Tree of Thoughts framework that enhances LLM problem-solving by exploring multiple reasoning paths.
It leverages thought decomposition, generation, and state evaluation using classical search algorithms like BFS and DFS to improve task outcomes.
Empirical results in games, creative writing, and mini crosswords demonstrate notable performance gains over traditional chain-of-thought methods.

Tree of Thoughts: Deliberate Problem Solving with LLMs

Introduction

The paper "Tree of Thoughts: Deliberate Problem Solving with LLMs" introduces a novel framework to address limitations in current LLMs during problem-solving tasks that require exploration, strategic lookahead, or adaptive decision-making. The proposed framework, Tree of Thoughts (ToT), extends the traditional autoregressive models by employing a search tree that allows for maintaining multiple reasoning paths. The authors draw inspiration from human cognitive models, specifically the dual process theory, to introduce a method that blends fast, automated reasoning with deliberate, strategic search.

Framework Details

The ToT framework generalizes the Chain of Thought (CoT) approach, structuring problem-solving as a search over a tree where nodes represent intermediate steps or "thoughts." This framework leverages classical AI search algorithms such as BFS and DFS, optimizing decision-making by evaluating and backtracking when necessary.

Figure 1: Schematic illustrating various approaches to problem solving with LLMs. Each rectangle box represents a thought, which is a coherent language sequence that serves as an intermediate step toward problem solving.

Key Components:

Thought Decomposition: Task-specific decomposition of problems into coherent thought steps allows for diversity and flexibility in problem-solving.
Thought Generation: Leveraging LLMs to generate multiple plausible continuations for a given state, either through i.i.d. sampling or sequential proposals.
State Evaluation: Utilizing LLMs for heuristic evaluation of states by assigning value or voting, to inform search processes critically.
Search Algorithm: Integrating BFS or DFS in the ToT framework facilitates efficient exploration and ensures adaptability across different tasks.

Experiments

Game of 24, Creative Writing, and Mini Crosswords are used to empirically validate the efficacy of ToT over traditional CoT and IO methods.

Game of 24

A mathematical challenge demanding strategic equation formulation to achieve a target number exposed the ToT's capacity in problem-solving, drastically outperforming CoT prompting by securing a significant improvement in success rate to 74%.

Figure 2: ToT in a game of 24. The LM is prompted for (a) thought generation and (b) valuation.

Creative Writing

This task required generating coherent passages from random sentences. ToT successfully used strategic planning and self-assessment to produce more coherent narratives, as confirmed by both automated and human evaluations.

Figure 3: A step of deliberate search in a randomly picked Creative Writing task. Given the input, the LM samples 5 different plans, then votes 5 times to decide which plan is best.

Mini Crosswords

ToT's DFS approach allowed backtracking and strategic thought censorship, achieving a word-level success rate of 60%, indicating its strength in tasks requiring systematic exploration through combinatorial problem spaces.

Figure 4: In Mini Crosswords, (a) how thoughts are proposed and aggregated in a priority queue for DFS, and (b) how a state is evaluated based on the possibility of filling in each remaining word clue, and pruned if any remaining clue is deemed not possible to fill by the LM.

Implications and Future Directions

The ToT framework's integration of classical AI problem-solving methods with modern LLMs potentially broadens their applicability to complex decision-making and real-world applications. While demonstrating advanced effectiveness on specific difficult tasks, the framework encourages future research in enhancing LLMs' deliberative capabilities, optimizing computational resource utilization, and even fine-tuning models for thought-specific processes. The modular nature of ToT also suggests potential extensions involving hybrid models that fuse different AI paradigms.

Conclusion

The Tree of Thoughts framework significantly extends the problem-solving capabilities of LLMs by incorporating deliberate reasoning and strategic exploration. As AI systems find increasing deployment in environments demanding complex reasoning and decision-making, frameworks like ToT are poised to play a crucial role in advancing the field.