Learning to Diversify Neural Text Generation via Degenerative Model (2309.12619v1)

Published 22 Sep 2023 in cs.CL

Abstract: Neural language models often fail to generate diverse and informative texts, limiting their applicability in real-world problems. While previous approaches have proposed to address these issues by identifying and penalizing undesirable behaviors (e.g., repetition, overuse of frequent words) from language models, we propose an alternative approach based on an observation: models primarily learn attributes within examples that are likely to cause degeneration problems. Based on this observation, we propose a new approach to prevent degeneration problems by training two models. Specifically, we first train a model that is designed to amplify undesirable patterns. We then enhance the diversity of the second model by focusing on patterns that the first model fails to learn. Extensive experiments on two tasks, namely language modeling and dialogue generation, demonstrate the effectiveness of our approach.
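
The abstract does not state the training objective, so the following is only an illustrative sketch of one way the two-model idea could be realized, not the paper's actual loss. It assumes a hypothetical weighting scheme in which each target token's loss for the second model is scaled by one minus the probability the first (degenerative) model assigns to that token, so tokens the first model already predicts well contribute less. The function name `weighted_nll` and both argument names are assumptions made for illustration.

```python
import math


def weighted_nll(p_second, p_first, eps=1e-12):
    """Weighted negative log-likelihood for the second model.

    p_second: probabilities the second model assigns to each target token.
    p_first:  probabilities the first (degenerative) model assigns to the
              same target tokens.

    ASSUMPTION: the weight 1 - p_first is a hypothetical choice made for
    this sketch; the abstract does not specify how the second model is
    made to focus on patterns the first model fails to learn.
    """
    if len(p_second) != len(p_first):
        raise ValueError("both models must score the same target tokens")

    # Tokens the first model predicts confidently get a small weight.
    weights = [1.0 - pf for pf in p_first]
    # Per-token negative log-likelihood of the second model.
    losses = [-math.log(max(ps, eps)) for ps in p_second]

    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # every token was fully explained by the first model
    return sum(w * l for w, l in zip(weights, losses)) / total_weight


# The second model is wrong on a token the first model finds easy (0.9)
# and right on a token the first model finds hard (0.1).
loss = weighted_nll(p_second=[0.1, 0.9], p_first=[0.9, 0.1])
print(round(loss, 4))
```

Under this assumed weighting, the error on the token that the first model finds easy is largely discounted, so the weighted loss is well below the unweighted average over the same two tokens.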
