Guiding Language Model Reasoning with Planning Tokens (2310.05707v4)
Abstract: Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought (CoT) reasoning. However, most of the existing approaches to enhance this ability rely heavily on data-driven methods, while neglecting the structural aspects of the model's reasoning capacity. To encourage a more structured generation of CoT steps, we propose a hierarchical generation scheme: we let the LM generate a planning token at the start of each reasoning step, intuitively serving as a high-level plan of the current step, and add the embeddings of these planning tokens to the model's parameters. Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme. We demonstrate our method's effectiveness by applying it to three different LLMs, showing notable accuracy improvements across three math word problem datasets and one multi-hop QA dataset with respect to standard fine-tuning baselines.
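The scheme described in the abstract can be illustrated with a minimal sketch: a small vocabulary of planning tokens is appended to the model's embedding matrix, and each chain-of-thought step in the fine-tuning data is prefixed with its assigned planning token. The snippet below assumes a HuggingFace causal LM; the model name ("gpt2" as a stand-in), the number of planning tokens, the `<plan_k>` token names, and the hand-supplied plan assignments are all illustrative assumptions rather than the paper's exact setup (the paper infers plan assignments automatically from the reasoning steps).

```python
# Minimal sketch of the planning-token scheme; assumptions are noted in comments.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper fine-tunes larger LLMs such as Llama 2
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# 1) Add a small planning-token vocabulary. Only the new embedding rows are
#    extra trainable parameters, hence the negligible (~0.001%) increase.
num_plans = 8  # assumed number of distinct high-level "plans"
plan_tokens = [f"<plan_{k}>" for k in range(num_plans)]
tokenizer.add_tokens(plan_tokens, special_tokens=True)
model.resize_token_embeddings(len(tokenizer))

# 2) Prefix each reasoning step with its assigned planning token when building
#    the fine-tuning targets (plan assignments are assumed to be given here).
def tag_solution(steps, plan_ids):
    return "\n".join(f"<plan_{p}> {s}" for s, p in zip(steps, plan_ids))

steps = ["48 / 2 = 24 clips were sold in May.",
         "48 + 24 = 72 clips were sold in total."]
print(tag_solution(steps, plan_ids=[3, 1]))

# After fine-tuning (full or parameter-efficient, e.g. with LoRA on top of this),
# the LM first generates a planning token and then the step conditioned on it,
# repeating this pattern for every reasoning step.
```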