Controlled Text Generation via Language Model Arithmetic (2311.14479v2)
Abstract: As LLMs are deployed more widely, customization with respect to vocabulary, style, and character becomes more important. In this work, we introduce model arithmetic, a novel inference framework for composing and biasing LLMs without the need for model (re)training or highly specific datasets. In addition, the framework allows for more precise control of generated text than direct prompting and prior controlled text generation (CTG) techniques. Using model arithmetic, we can express prior CTG techniques as simple formulas and naturally extend them to new and more effective formulations. Further, we show that speculative sampling, a technique for efficient LLM sampling, extends to our setting, enabling highly efficient text generation with multiple composed models at only marginal overhead over a single model. Our empirical evaluation demonstrates that model arithmetic allows fine-grained control of generated text while outperforming the state of the art on the task of toxicity reduction. We release an open-source, easy-to-use implementation of our framework at https://github.com/eth-sri/language-model-arithmetic.
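A minimal sketch of the idea, assuming PyTorch and Hugging Face `transformers` (the model, prompts, and weight below are illustrative choices, not the released library's API): a prior CTG technique such as classifier-free guidance becomes a simple weighted combination of per-model next-token log-probabilities, which is renormalized and sampled from.

```python
# Illustrative sketch of composing next-token distributions via a linear
# formula over log-probabilities. Model name, prompts, and gamma are
# hypothetical example values.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def next_token_logprobs(prompt: str) -> torch.Tensor:
    """Log-probabilities of the next token given `prompt`."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids=ids).logits[0, -1]
    return F.log_softmax(logits, dim=-1)

# Classifier-free-guidance-style formula written as model arithmetic:
#   M + gamma * (M_cond - M), with gamma > 1 amplifying the attribute prompt.
gamma = 1.5
base = next_token_logprobs("The movie was")
conditioned = next_token_logprobs("Write a positive review. The movie was")
combined = base + gamma * (conditioned - base)

probs = F.softmax(combined, dim=-1)            # renormalize the composition
token_id = torch.multinomial(probs, 1).item()  # sample the next token
print(tokenizer.decode(token_id))
```

Other compositions fit the same template; for example, a DExperts-style formula replaces the conditioned term with the difference between an expert and an anti-expert model, again expressed as a weighted sum of log-probabilities.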