Aligner: One Global Token is Worth Millions of Parameters When Aligning Large Language Models

(2312.05503)
Published Dec 9, 2023 in cs.CL, cs.AI, and cs.LG

Abstract

We introduce Aligner, a novel Parameter-Efficient Fine-Tuning (PEFT) method for aligning multi-billion-parameter-sized LLMs. Aligner employs a unique design that constructs a globally shared set of tunable tokens that modify the attention of every layer. Remarkably, with this method, even when using one token accounting for a mere 5,000 parameters, Aligner can still perform comparably well to state-of-the-art LLM adaptation methods like LoRA that require millions of parameters. This capacity is substantiated in both instruction following and value alignment tasks. Besides the multiple-order-of-magnitude improvement in parameter efficiency, the insight Aligner provides into the internal mechanisms of LLMs is also valuable. The architectural features and efficacy of our method, together with our experiments, demonstrate that an LLM separates its internal handling of "form" and "knowledge" in a somewhat orthogonal manner. This finding promises to motivate new research into LLM mechanism understanding and value alignment.

Overview

  • Introduction of Aligner, a Parameter-Efficient Fine-Tuning method for LLMs using a globally shared prefix token.

  • Contrasts with traditional layer-specific prefix methods by sharing a single token across all transformer layers, modifying each layer's attention through a separately computed, gated term.

  • Evaluation shows that Aligner competes with leading methods while using significantly fewer parameters, indicating potential for industrial application.

  • GPT-4-judged evaluations on standard benchmarks show that Aligner's single token can rival traditional PEFT methods in instruction following and safety preference tasks.

  • Provides insights into LLM structure and potential implications for AI safety and control advancements.

Introduction

The paper introduces Aligner, an innovative Parameter-Efficient Fine-Tuning (PEFT) method designed for aligning multi-billion-parameter LLMs. Aligner employs a globally shared set of tunable tokens that modifies the attention mechanism within every layer of a transformer-based model. Notably, Aligner demonstrates that even a minimal number of tokens, such as a single token representing a mere 5,000 parameters, can align these models comparably well to adaptation methods that require millions of tunable parameters.

Methodology

Aligner is inspired by previous PEFT methods such as LoRA and LLaMA-Adapter, but diverges by introducing a globally shared prefix token: the same token is used in every layer, in contrast to the layer-specific tokens employed by existing prefix-based methods. Attention to this prefix token is computed separately from the ordinary self-attention, and its contribution, modulated by a layer-specific gating factor, is added back to the original attention output, as sketched in the code below.
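
The following PyTorch snippet is a minimal sketch of that mechanism: a set of prefix tokens shared globally across all layers, attended to in a separate attention computation whose output is added back through a zero-initialized, layer-specific gate. Module and variable names (GlobalPrefixAttention, gate, num_prefix) and the exact gating form are illustrative assumptions, not the authors' reference implementation; the paper's headline configuration corresponds to num_prefix = 1.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalPrefixAttention(nn.Module):
    """Self-attention augmented with globally shared, gated prefix tokens."""

    def __init__(self, hidden: int, heads: int, shared_prefix: nn.Parameter):
        super().__init__()
        self.h, self.d = heads, hidden // heads
        self.qkv = nn.Linear(hidden, 3 * hidden, bias=False)
        self.proj = nn.Linear(hidden, hidden, bias=False)
        self.prefix = shared_prefix                    # same Parameter object in every layer
        self.gate = nn.Parameter(torch.zeros(heads))   # layer-specific gate, zero-initialized

    def _split(self, t, B):
        # (B, T, C) -> (B, H, T, D)
        return t.view(B, -1, self.h, self.d).transpose(1, 2)

    def forward(self, x):
        B, T, _ = x.shape
        q, k, v = (self._split(t, B) for t in self.qkv(x).chunk(3, dim=-1))

        # Ordinary causal self-attention over the input sequence.
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

        # Separate attention over the shared prefix tokens, reusing the
        # layer's own key/value projections for those tokens.
        p = self.prefix.unsqueeze(0).expand(B, -1, -1)            # (B, N, C)
        _, pk, pv = (self._split(t, B) for t in self.qkv(p).chunk(3, dim=-1))
        scores = q @ pk.transpose(-2, -1) / self.d ** 0.5         # (B, H, T, N)
        prefix_out = scores.softmax(dim=-1) @ pv                  # (B, H, T, D)

        # Gated addition: with zero-initialized gates, training starts
        # from the behavior of the unmodified base model.
        out = out + self.gate.view(1, -1, 1, 1) * prefix_out
        return self.proj(out.transpose(1, 2).reshape(B, T, -1))


if __name__ == "__main__":
    hidden, heads, layers, num_prefix = 64, 4, 2, 1
    shared_prefix = nn.Parameter(torch.randn(num_prefix, hidden) * 0.02)
    blocks = nn.ModuleList(
        GlobalPrefixAttention(hidden, heads, shared_prefix) for _ in range(layers)
    )
    x = torch.randn(2, 10, hidden)
    for blk in blocks:
        x = blk(x)
    # Only the shared prefix token and the per-layer gates would be trained.
    print(x.shape)  # torch.Size([2, 10, 64])
```

In this sketch, the trainable state is just the shared prefix token plus one small gate vector per layer, which is how a single token can account for only a few thousand parameters on a 7B-scale base model.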

Evaluation

The evaluation of Aligner focused on two form alignment tasks: instruction following and value alignment with human preferences. Evaluations judged by GPT-4, using the Vicuna Benchmark for instruction following and the Beaver Benchmark for safety preference, indicated that even a single global token could rival the performance of leading PEFT methods such as LLaMA-Adapter and LoRA. Aligner's efficiency also offers a practical benefit: over a million Aligner instances can fit alongside a 7-billion-parameter LLM on a single 24 GB GPU (a rough calculation follows below), with implications for customized, user-oriented models in industrial applications.
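
A back-of-the-envelope check of that memory claim, assuming 16-bit storage for both the base model and the per-user Aligner weights (the exact per-instance size depends on the base model's hidden width and layer count):

```python
base_model = 7e9 * 2         # 7B-parameter LLM in fp16      -> ~14 GB
per_aligner = 5_000 * 2      # one global-token Aligner      -> ~10 KB
budget = 24e9 - base_model   # memory remaining on a 24 GB GPU
print(budget / per_aligner)  # ~1e6 Aligner instances
```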

Conclusion and Discussion

Aligner’s success provides insights into the inner workings of LLMs, illustrating that “form” can indeed function orthogonally to “knowledge” or “ability.” The architecture's exceptional efficiency and performance on form alignment tasks suggest a distinct global component for "form," influencing how knowledge is applied within LLMs. This finding holds promise for future research into LLM mechanisms and aligning AI models with human values.

Ultimately, the paper positions Aligner not only as a highly efficient PEFT method for form alignment tasks but also as a useful tool for probing how LLMs operate, paving the way for further exploration in AI safety and control.
