Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models (2306.06891v1)

Published 12 Jun 2023 in cs.CL and cs.AI

Abstract: Generating intermediate steps, or Chain of Thought (CoT), is an effective way to significantly improve language models' (LM) multi-step reasoning capability. However, the CoT lengths can grow rapidly with the problem complexity, easily exceeding the maximum context size. Instead of increasing the context limit, which has already been heavily investigated, we explore an orthogonal direction: making LMs divide a problem into multiple contexts. We propose a new inference framework, called Recursion of Thought (RoT), which introduces several special tokens that the models can output to trigger context-related operations. Extensive experiments with multiple architectures including GPT-3 show that RoT dramatically improves LMs' inference capability to solve problems, whose solution consists of hundreds of thousands of tokens.

Summary

The paper introduces Recursion of Thought (RoT), a recursive divide-and-conquer framework that overcomes context limitations in language models.
The methodology uses special tokens (GO, STOP, THINK) to break complex problems into subproblems, significantly boosting performance on arithmetic and algorithmic tasks.
Experimental results show near-perfect accuracy on these tasks for models ranging from GPT-3 to small Transformers and LSTMs trained without pre-training.

Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models

Introduction

The paper "Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models" (2306.06891) introduces an inference framework named Recursion of Thought (RoT). RoT extends the reasoning capability of language models (LMs) such as GPT-3 by applying a recursive divide-and-conquer strategy across multiple contexts. Earlier methods such as Chain of Thought (CoT) improve reasoning by having the model generate intermediate steps, but all of those steps must fit in a single context, so CoT is bounded by the model's maximum context size. RoT addresses this limitation by letting the model break a complex task into smaller sub-problems, each solved in its own context, so that no single context has to hold the whole solution.

Methodology

The RoT framework is built on three main components: the special tokens GO, STOP, and THINK; a recursive problem-solving strategy; and a training regime that supports the recursion. Rather than extending the context length, RoT lets the model create new contexts by emitting these tokens. During inference, the model uses them to demarcate and handle the subproblems of a given task. The GO and STOP tokens mark the beginning and end of a problem or subproblem, respectively. The THINK token initiates the recursion: the subproblem is solved in a separate context, and its answer then takes the place of the THINK token in the parent context. Reasoning is thereby spread over many small contexts instead of one long one.
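The control flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `rot_solve`, `toy_sum_policy`, the whitespace token format, and the convention that the model returns tokens in chunks ending with THINK or STOP are all hypothetical. A hand-written rule stands in for the trained LM, and the toy task (summing a list of integers by recursing on its tail) is chosen only to keep the example short.

```python
GO, STOP, THINK = "GO", "STOP", "THINK"

def rot_solve(model, question, context_lengths=None):
    """Solve `question` (tokens GO ... =) in its own context; return the answer tokens."""
    context = list(question)
    while True:
        # Assumption: the model generates up to and including the next THINK or STOP.
        chunk = model(context)
        if chunk[-1] == THINK:
            sub_question = chunk[:-1]          # GO ... = for the subproblem
            context.extend(sub_question)
            # Recurse in a fresh context; the returned answer takes THINK's place.
            context.extend(rot_solve(model, sub_question, context_lengths))
        elif chunk[-1] == STOP:
            context.extend(chunk)
            if context_lengths is not None:
                context_lengths.append(len(context))
            return chunk                       # final answer of this context
        else:
            raise ValueError("model output must end with THINK or STOP")

def toy_sum_policy(context):
    """Rule-based stand-in for a trained LM (hypothetical): sums two or more integers."""
    end_q = context.index("=")
    operands = [int(t) for t in context[1:end_q] if t != "+"]
    if len(operands) == 2:                     # base case: answer directly
        return [str(operands[0] + operands[1]), STOP]
    # Sub-answers already present in this context (the token before each STOP).
    sub_answers = [int(context[i - 1]) for i, t in enumerate(context) if t == STOP]
    if len(sub_answers) == 0:                  # first subproblem: sum of the tail
        tail = " + ".join(map(str, operands[1:])).split()
        return [GO] + tail + ["=", THINK]
    if len(sub_answers) == 1:                  # second subproblem: head + tail sum
        return [GO, str(operands[0]), "+", str(sub_answers[0]), "=", THINK]
    return [str(sub_answers[1]), STOP]         # copy the last sub-answer as the answer

lengths = []
answer = rot_solve(toy_sum_policy, "GO 3 + 4 + 5 =".split(), lengths)
print(answer)  # ['12', 'STOP']
```

Each call to `rot_solve` owns exactly one context, so the parent never sees the child's intermediate tokens, only its answer. In a real system the `model` argument would wrap an LM's decoding loop rather than a rule.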

RoT is trained with supervised learning on ground-truth intermediate steps, which specify when each special token should be emitted. This supervision shows the model how to decompose a problem into subproblems and how to combine their answers into the final answer.
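One plausible way to turn a problem into such supervision is to emit one (input, target) pair per context. The sketch below does this for a toy summation task; it is an assumption-laden illustration, not the paper's data pipeline. The names `question_tokens` and `build_examples`, the PAD placeholder, and the choice to mask the question and the inserted sub-answers out of the targets are all hypothetical. The targets are aligned position by position with the inputs, and a training loop would be expected to skip PAD positions in the loss.

```python
GO, STOP, THINK, PAD = "GO", "STOP", "THINK", "PAD"

def question_tokens(operands):
    """Tokenize a summation question, e.g. [3, 4] -> GO 3 + 4 =."""
    return [GO] + " + ".join(map(str, operands)).split() + ["="]

def build_examples(operands):
    """Return (answer_tokens, examples) for summing `operands` (two or more integers).

    Each example is an (inputs, targets) pair describing one context.
    """
    question = question_tokens(operands)
    answer = [str(sum(operands)), STOP]
    inputs = list(question)
    targets = [PAD] * len(question)            # the question is given, not predicted
    examples = []
    if len(operands) > 2:
        # Two subproblems: sum of the tail, then head + tail sum.
        for sub_ops in (operands[1:], [operands[0], sum(operands[1:])]):
            sub_q = question_tokens(sub_ops)
            sub_a, sub_examples = build_examples(sub_ops)
            examples.extend(sub_examples)      # each subproblem is its own context
            inputs += sub_q + sub_a            # the input contains the real sub-answer
            # The model should emit the subquestion, then THINK; the rest of the
            # sub-answer is filled in from the child context, so it is masked here.
            targets += sub_q + [THINK] + [PAD] * (len(sub_a) - 1)
    inputs += answer
    targets += answer
    return answer, [(inputs, targets)] + examples

answer, examples = build_examples([3, 4, 5])
for inputs, targets in examples:
    print(" ".join(inputs))
    print(" ".join(targets))
```

For the three-operand problem this yields three training contexts: the parent, which contains two THINK targets, and one base-case context per subproblem.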

Experimental Results

The experiments indicate that RoT lets models solve problems whose solutions span up to hundreds of thousands of tokens, across arithmetic and algorithmic tasks. Models using RoT were compared with models using conventional CoT and with models that generate no intermediate steps at all. RoT reaches near-perfect accuracy on these long-solution tasks, which supports its robustness and scalability on problems too large for a single context.

Moreover, testing was not limited to large models like GPT-3 but extended to small Transformers and LSTMs trained without pre-training. Even these smaller architectures performed complex reasoning tasks effectively under the RoT framework, which suggests that the approach applies across model sizes.

Implications and Future Work

RoT lets an LM handle reasoning tasks whose solutions exceed its context limit without any architectural change, because the recursion is driven entirely by tokens the model emits. This could play an important role in future LMs, especially in settings where task solutions inherently exceed current context limits.

However, the current implementation relies heavily on supervised training, which is computationally expensive. Future work could focus on reducing this dependency, possibly through reinforcement learning techniques that maintain performance while minimizing the supervision requirement.

The potential of RoT extends beyond the current set of synthetic tasks designed for this paper, opening avenues for application in natural language tasks that demand long reasoning chains. A critical next step involves curating or developing datasets that naturally require lengthy reasoning beyond existing benchmarks.

Conclusion

Recursion of Thought offers a practical answer to the limits that context size places on LMs, and it suggests a different way to approach complex reasoning tasks. By combining recursion with a divide-and-conquer strategy, RoT points toward more scalable LMs that can handle tasks previously considered impractical because of the amount of reasoning they require.