Aligning LLM Agents by Learning Latent Preference from User Edits

(2404.15269)
Published Apr 23, 2024 in cs.CL, cs.AI, cs.IR, and cs.LG

Abstract

We study interactive learning of language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent's response to personalize it based on their latent preference, in addition to improving its correctness. The edit feedback is naturally generated, making it a suitable candidate for improving the agent's alignment with the user's preference, and for reducing the cost of user edits over time. We propose a learning framework, PRELUDE, that infers a description of the user's latent preference from historic edit data and uses it to define a prompt policy that drives future response generation. This avoids fine-tuning the agent, which is costly, challenging to scale with the number of users, and may even degrade its performance on other tasks. Furthermore, learning a descriptive preference improves interpretability, allowing the user to view and modify the learned preference. However, user preference can be complex and vary with context, making it challenging to learn. To address this, we propose a simple yet effective algorithm named CIPHER that leverages an LLM to infer the user's preference for a given context based on user edits. On future contexts, CIPHER retrieves inferred preferences from the k-closest contexts in the history and forms an aggregate preference for response generation. We introduce two interactive environments -- summarization and email writing -- for evaluation using a GPT-4 simulated user. We compare with algorithms that directly retrieve user edits but do not learn a descriptive preference, and algorithms that learn a context-agnostic preference. On both tasks, CIPHER achieves the lowest edit distance cost and learns preferences that show significant similarity to the ground truth preferences.

Interactive learning process from user edits; agent uses plain text revisions as feedback.

Overview

  • This paper introduces a learning framework named PRELUDE, which utilizes edits made by users to improve and personalize future responses from LLMs without retraining the entire model.

  • A key component, CIPHER, infers user preferences by analyzing historical edits and retrieving them for similar contexts, improving response accuracy and reducing the need for user corrections over time.

  • Empirical tests show that CIPHER outperforms baseline methods in simulated environments for tasks like summarization and email writing, demonstrating practical effectiveness in improving user satisfaction without massive computational costs.

Exploring Preference Learning through User Edits in Language Models

Introduction

Language agents, especially those powered by LLMs, are becoming increasingly integral to applications ranging from writing assistants to customer support. While LLMs exhibit robust zero-shot capabilities, their generic responses often lack personalization, which can be crucial for user-specific tasks. A natural and frequent form of user feedback in these applications is the edits users make to the responses generated by these agents. This paper introduces a novel learning framework called PRELUDE, which stands for PREference Learning from User's Direct Edits, focusing on harnessing these user edits not just to adjust responses on the fly but to understand and adapt to user-specific preferences over time.

The Mechanics of PRELUDE and CIPHER

PRELUDE does not fine-tune the underlying LLM, sidestepping the scalability and cost issues of per-user model adjustments. Instead, it learns a 'prompt policy' that predicts user preferences from previously observed edits and incorporates them into the model's generation process. This mechanism infers preferences without the computational and logistical overhead of model retraining.
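The interaction loop can be sketched as follows. This is an illustrative toy, not the authors' code: `generate`, `user_edit`, and `infer_preference` stand in for LLM and user calls, and `memory` is a simple context-to-preference map.

```python
def prelude_round(context, generate, user_edit, infer_preference, memory):
    """One round of a PRELUDE-style interaction (sketch).

    generate(prompt)                     -> agent's draft response
    user_edit(response)                  -> user's (possibly unchanged) revision
    infer_preference(ctx, resp, edited)  -> textual preference description
    memory: dict mapping contexts to previously learned preferences
    """
    preference = memory.get(context)  # learned on earlier rounds, if any
    prompt = context if preference is None else f"[preference: {preference}] {context}"
    response = generate(prompt)
    edited = user_edit(response)      # the user may revise the draft
    if edited != response:            # an edit reveals the latent preference
        memory[context] = infer_preference(context, response, edited)
    return response, edited
```

Over repeated rounds the memory accumulates preference descriptions, so later prompts are conditioned on what earlier edits revealed, which is how the framework reduces edit cost over time.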

A critical component of PRELUDE is an algorithm called CIPHER (Consolidates Induced Preferences based on Historical Edits with Retrieval). CIPHER operates by retrieving the user's past edits and preferences, identifying patterns, and using these to predict future preferences in similar contexts. Notably, this process typically requires shorter prompts compared to methods that use longer contextual retrievals, thus reducing the overhead on the model.
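The retrieve-and-aggregate step described above can be sketched as a small class. This is a minimal illustration under stated assumptions: `record` and `aggregate_preference` are hypothetical names, context embeddings are plain vectors, and the aggregation is a simple concatenation, whereas the paper has an LLM form the aggregate preference.

```python
from dataclasses import dataclass, field

@dataclass
class CipherMemory:
    """Toy sketch of CIPHER's k-nearest preference retrieval."""
    history: list = field(default_factory=list)  # (embedding, preference) pairs

    def record(self, context_embedding, inferred_preference):
        self.history.append((context_embedding, inferred_preference))

    def aggregate_preference(self, query_embedding, k=3):
        if not self.history:
            return None  # nothing learned yet; the agent answers generically

        def cosine(u, v):
            dot = sum(x * y for x, y in zip(u, v))
            nu = sum(x * x for x in u) ** 0.5
            nv = sum(x * x for x in v) ** 0.5
            return dot / (nu * nv) if nu and nv else 0.0

        # Rank stored contexts by similarity to the new context and
        # aggregate the k nearest preferences (here: concatenation).
        ranked = sorted(self.history,
                        key=lambda pair: cosine(query_embedding, pair[0]),
                        reverse=True)
        return "; ".join(pref for _, pref in ranked[:k])
```

Because only short preference descriptions are retrieved, the resulting prompt stays compact relative to retrieving full past edits, which is the overhead reduction noted above.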

Empirical Evaluations and Findings

The authors tested CIPHER within two simulated interactive environments representing common use cases of language models: summarization and email writing tasks. The simulated users in these environments interacted with the agent, providing naturalistic edits based on latent preferences specific to different document types. For instance, a movie review might elicit preferences for a concise, point-wise summary, while an academic abstract might require detailed explanations.

In both environments, CIPHER demonstrated a stronger ability to reduce the edit distance (a metric quantifying the discrepancy between the generated text and the user-edited version) compared to several baseline methods. Specifically, it outperformed the no-learning baseline by adapting to user preferences over time, as indicated by the progressively decreasing requirement for user edits.
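For concreteness, the edit distance here is the standard Levenshtein distance. Below is a character-level dynamic-programming sketch; the paper's exact cost may be computed at a different granularity (e.g. over tokens), so treat this as an illustration of the metric rather than the authors' implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]
```

A falling cumulative edit distance across rounds is the signal that the agent is converging on the user's latent preference.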

Theoretical Contributions and Practical Implications

Beyond empirical performance, the theoretical nuances of this work lie in its approach to aggregate and refine user preferences effectively. By leveraging historical data and avoiding the retraining of base models, PRELUDE and CIPHER bring an economically feasible personalization to language models without compromising the model's core capabilities.

For practical application, embedding a system like CIPHER into consumer-facing LLM applications could significantly enhance user satisfaction by reducing the need for constant corrections and providing responses that feel more intuitively aligned with individual user styles and preferences.

Future Prospects

Looking ahead, this approach could pave the way for more nuanced user-model interactions where the model not only responds accurately but also evolves in line with user preferences seamlessly. Further research might explore the limits of preference learning, particularly in how detailed and subtle preferences can be captured without explicit user feedback.

Furthermore, investigating the robustness of such systems in diverse real-world scenarios, outside of controlled experimental conditions, would be crucial to understanding their utility and areas for enhancement.

Conclusion

The development of CIPHER within the PRELUDE framework marks a significant step towards more personalized, user-aware applications of LLMs. By intelligently leveraging user edits, not merely as feedback but as a window into user preferences, this research contributes both to the enhancement of user experience and to the operational efficiency of deploying language models in personalized applications.
