Controllable Text Generation in the Instruction-Tuning Era (2405.01490v1)
Abstract: While most research on controllable text generation has focused on steering base LLMs, the emerging instruction-tuning and prompting paradigm offers an alternative approach to controllability. We compile and release ConGenBench, a testbed of 17 different controllable generation tasks, and use a subset of it to benchmark the performance of 9 different baselines and methods on instruction-tuned LLMs. To our surprise, we find that prompting-based approaches outperform controllable text generation methods on most datasets and tasks, highlighting a need for research on controllable text generation specifically with instruction-tuned LLMs. Prompt-based approaches match human performance on most stylistic tasks while lagging on structural tasks, foregrounding a need to study more varied constraints and more challenging stylistic tasks. To facilitate such research, we provide an algorithm that uses only a task dataset and an LLM with in-context capabilities to automatically generate a constraint dataset. This method eliminates the field's dependence on pre-curated constraint datasets, vastly expanding the range of constraints that can be studied in the future.
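The abstract only summarizes the constraint-dataset generation algorithm, so the following is a minimal sketch of the general idea rather than the authors' released implementation: an LLM with in-context capabilities is few-shot prompted to label each example in a task dataset with a constraint attribute, and the resulting (text, label) pairs form an automatically generated constraint dataset. The function names, prompt template, label set, and `llm` wrapper here are illustrative assumptions, not taken from the paper.

```python
# Sketch: auto-building a constraint dataset from an unlabeled task dataset
# with a few-shot-prompted LLM judge. All names and prompt wording are
# hypothetical; the `llm` argument can wrap any instruction-tuned model.
from typing import Callable, List, Tuple

FEW_SHOT_EXAMPLES = [
    ("The plot was dull and the acting worse.", "negative"),
    ("A warm, funny, beautifully shot film.", "positive"),
]

def build_constraint_dataset(
    task_texts: List[str],
    constraint: str,                     # e.g. "sentiment of the text"
    labels: Tuple[str, ...],             # e.g. ("positive", "negative")
    llm: Callable[[str], str],           # any text-in / text-out LLM wrapper
) -> List[Tuple[str, str]]:
    """Label each task example with an LLM judge to form (text, label) pairs."""
    dataset = []
    for text in task_texts:
        prompt = f"Classify the {constraint} as one of {list(labels)}.\n\n"
        for ex_text, ex_label in FEW_SHOT_EXAMPLES:
            prompt += f"Text: {ex_text}\nLabel: {ex_label}\n\n"
        prompt += f"Text: {text}\nLabel:"
        answer = llm(prompt).strip().lower()
        # Keep only examples where the judge produced a valid label.
        if answer in labels:
            dataset.append((text, answer))
    return dataset

if __name__ == "__main__":
    # Toy stand-in for an instruction-tuned LLM, so the sketch runs end to end.
    def toy_llm(prompt: str) -> str:
        last_text = prompt.rsplit("Text:", 1)[-1]
        return "negative" if "boring" in last_text else "positive"

    pairs = build_constraint_dataset(
        ["A boring sequel nobody asked for.", "Sharp writing and a great cast."],
        constraint="sentiment of the text",
        labels=("positive", "negative"),
        llm=toy_llm,
    )
    print(pairs)
```

In this framing, the only inputs are the task dataset itself and a natural-language description of the constraint; the labeled output can then train or calibrate a constraint evaluator, which is what removes the dependence on pre-curated constraint datasets.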