
Controlled Text Generation via Language Model Arithmetic

(arXiv:2311.14479)
Published Nov 24, 2023 in cs.CL

Abstract

As LLMs are deployed more widely, customization with respect to vocabulary, style, and character becomes more important. In this work, we introduce model arithmetic, a novel inference framework for composing and biasing LLMs without the need for model (re)training or highly specific datasets. In addition, the framework allows for more precise control of generated text than direct prompting and prior controlled text generation (CTG) techniques. Using model arithmetic, we can express prior CTG techniques as simple formulas and naturally extend them to new and more effective formulations. Further, we show that speculative sampling, a technique for efficient LLM sampling, extends to our setting. This enables highly efficient text generation with multiple composed models with only marginal overhead over a single model. Our empirical evaluation demonstrates that model arithmetic allows fine-grained control of generated text while outperforming state-of-the-art on the task of toxicity reduction. We release an open source easy-to-use implementation of our framework at https://github.com/eth-sri/language-model-arithmetic.

Figure: Attribute values across various attributes and formulas; the dashed line shows the values obtained when models are directly prompted.

Overview

  • The paper presents a novel 'model arithmetic' framework that enables refined control of text generation in LLMs without retraining.

  • Model arithmetic allows for the combination of language models and attribute models to manipulate attributes such as style, tone, and vocabulary.

  • The framework can encompass and extend existing controlled text generation techniques by translating them into mathematical formulas.

  • Generalized speculative sampling is introduced to enhance computational efficiency during inference with multiple models.

  • Model arithmetic is empirically evaluated, demonstrating fine-grained control of attributes such as toxicity while outperforming existing methods.

Summary

The paper introduces a novel framework termed "model arithmetic" for tailoring LLMs to generate text with specific characteristics such as vocabulary, style, or tone, without retraining or specialized datasets. Model arithmetic combines multiple models and attribute conditions into a single mathematical formula that modifies the LLM's token distribution during inference. The framework also subsumes prior controlled text generation (CTG) techniques by expressing them as simple formulas. An open-source implementation is made available.
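To make the core idea concrete, here is a minimal sketch of composing two token distributions via a linear formula over log-probabilities. All names, weights, and the toy vocabulary are illustrative assumptions, not the paper's API:

```python
import numpy as np

def softmax(logits):
    """Convert a vector of logits into a probability distribution."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical per-token log-probabilities from two prompted variants of the
# same base model, over a toy 5-token vocabulary.
rng = np.random.default_rng(0)
logp_formal = rng.normal(size=5)  # model prompted toward a formal style
logp_casual = rng.normal(size=5)  # model prompted toward a casual style

# A linear formula over the two models: push toward formality and away from
# casual phrasing. The weight 0.6 is made up for illustration.
combined = logp_formal - 0.6 * logp_casual
next_token_probs = softmax(combined)

# Sample the next token from the composed distribution.
token = int(rng.choice(len(next_token_probs), p=next_token_probs))
print(next_token_probs, token)
```

In the actual framework this combination is applied at every decoding step, so the formula steers the whole generation rather than a single token.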

Introduction

LLM customization is essential for diverse applications involving different audience groups. The traditional techniques, prompting and fine-tuning, either offer limited precision of control or require extensive data and compute. Model arithmetic overcomes these issues by providing an intuitive method to merge multiple LLMs and attribute models, creating composite models that precisely govern the attributes of the generated text. The methodology subsumes prior CTG techniques within its formula-based system, thereby broadening the scope and precision of CTG.

Fine-Grained Control via Model Arithmetic

Model arithmetic provides a systematic way to finely control generated text by combining models that each capture a particular attribute. The framework's flexibility is illustrated with an example that assembles models responsible for attributes such as "child," "adult," and "magic," together with a classifier for "formality." The influence of each attribute or component on the output can be weighted precisely, surpassing what direct prompting and fine-tuning can achieve.
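The sketch below illustrates how such a composite formula might be evaluated step by step. The stand-in logit tables, the attribute names, and the specific weights are hypothetical; real prompted LLMs and a trained classifier would replace them:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def compose(terms, context):
    """Evaluate a weighted sum of per-model logit functions.

    terms: list of (weight, fn) pairs; fn(context) returns a logit vector
    over the vocabulary for the next token.
    """
    return sum(w * fn(context) for w, fn in terms)

def generate(terms, steps, vocab_size, seed=0):
    """Sample tokens one at a time from the composed distribution."""
    rng = np.random.default_rng(seed)
    context = []
    for _ in range(steps):
        probs = softmax(compose(terms, context))
        context.append(int(rng.choice(vocab_size, p=probs)))
    return context

# Hypothetical stand-ins for prompted LLMs ("child", "adult", "magic") and a
# formality classifier; each returns a fixed logit table in this toy example.
V = 8
rng = np.random.default_rng(42)
tables = {name: rng.normal(size=V) for name in ("child", "adult", "magic", "formal")}
as_fn = lambda name: (lambda context: tables[name])

# Illustrative composite: emphasize child-like and magical text, subtract the
# adult persona, and bias toward formal continuations.
formula = [(1.0, as_fn("child")), (0.5, as_fn("magic")),
           (-0.3, as_fn("adult")), (0.2, as_fn("formal"))]
print(generate(formula, steps=5, vocab_size=V))
```

The negative weight on the "adult" term is what distinguishes this from simple prompt mixing: the formula can actively push probability mass away from an attribute, not just toward one.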

Efficient Model Arithmetic via Generalized Speculative Sampling

A challenge with CTG is increased inference time due to the need to evaluate multiple models. Model arithmetic alleviates this through generalized speculative sampling, an extension of an existing latency-reduction technique. The generalization postpones the more expensive model calls within an arithmetic formula, validating cheaply drafted tokens against the full formula only as needed. The result is efficient execution with only marginal overhead over a single model, even when the formula composes several models.
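To convey the mechanism, here is a minimal sketch of the standard speculative accept/reject step for a single token. In the paper's generalization, a cheaper sub-formula plays the role of the draft model and expensive terms are evaluated lazily; the distributions below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

def speculative_step(draft_probs, target_probs):
    """One accept/reject step of speculative sampling for a single token.

    A cheap draft distribution proposes a token; the expensive target
    distribution (here, the full composed formula) is only used to validate
    it. On rejection, we resample from the normalized residual
    max(0, target - draft). The output is distributed exactly as target_probs.
    """
    x = int(rng.choice(len(draft_probs), p=draft_probs))
    if rng.random() < min(1.0, target_probs[x] / draft_probs[x]):
        return x  # draft accepted: the expensive model's work is amortized
    residual = np.maximum(target_probs - draft_probs, 0.0)
    residual /= residual.sum()
    return int(rng.choice(len(residual), p=residual))

# Made-up distributions: 'draft' from a cheap sub-formula, 'target' from the
# full model-arithmetic formula over a toy 3-token vocabulary.
draft = np.array([0.5, 0.3, 0.2])
target = np.array([0.4, 0.4, 0.2])
print(speculative_step(draft, target))
```

The closer the cheap sub-formula is to the full formula, the more drafts are accepted and the fewer expensive evaluations are needed, which is what enables the marginal overhead reported in the paper.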

Evaluation

Empirical evaluations demonstrate that model arithmetic produces expressive content with controlled attributes more effectively than existing CTG methods, particularly on the task of reducing toxicity in generated text. In addition, speculative sampling within model arithmetic yields significant computational savings, reducing model calls by up to 64%. Quantitative analysis across various tasks and comparison against state-of-the-art methods further show that the framework achieves nuanced control without a decline in fluency.
