CodecLM: Aligning Language Models with Tailored Synthetic Data (2404.05875v1)
Abstract: Instruction tuning has emerged as the key to aligning LLMs with specific task instructions, thereby mitigating the discrepancy between the next-token prediction objective and users' actual goals. To reduce the labor and time cost of collecting or annotating data by humans, researchers have started to explore the use of LLMs to generate instruction-aligned synthetic data. Recent works focus on generating diverse instructions and applying LLMs to increase instruction complexity, often neglecting downstream use cases. It remains unclear how to tailor high-quality data to elicit better instruction-following abilities in different target instruction distributions and LLMs. To this end, we introduce CodecLM, a general framework for adaptively generating high-quality synthetic data for LLM alignment with different downstream instruction distributions and LLMs. Drawing on the Encode-Decode principles, we use LLMs as codecs to guide the data generation process. We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution, and then decode metadata to create tailored instructions. We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples. Extensive experiments on four open-domain instruction-following benchmarks validate the effectiveness of CodecLM over the current state of the art.
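Since the abstract compresses the whole pipeline into a few sentences, the sketch below shows how an encode-decode loop with Self-Rubrics and Contrastive Filtering could be wired together. It is a minimal illustration, not the paper's released implementation: the `strong_llm`/`target_llm` callables, the `judge` scorer, the prompts, and the `gap_threshold` value are all hypothetical placeholders.

```python
# Minimal sketch of a CodecLM-style data generation loop.
# All helper names, prompts, and thresholds are assumptions for illustration.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Metadata:
    use_case: str        # e.g. "creative writing"
    skills: list[str]    # e.g. ["storytelling", "tone control"]


def encode(seed_instruction: str, strong_llm: Callable[[str], str]) -> Metadata:
    """Encode a seed instruction into concise keyword metadata (use case + skills)."""
    reply = strong_llm(
        "Summarize this instruction as 'use_case: ...; skills: a, b, c':\n"
        + seed_instruction
    )
    use_case_part, skills_part = reply.split(";", 1)
    use_case = use_case_part.split(":", 1)[1].strip()
    skills = [s.strip() for s in skills_part.split(":", 1)[1].split(",")]
    return Metadata(use_case=use_case, skills=skills)


def decode(meta: Metadata, strong_llm: Callable[[str], str], rubric_rounds: int = 2) -> str:
    """Decode metadata into a tailored instruction, then apply Self-Rubrics."""
    instruction = strong_llm(
        f"Write one instruction for the use case '{meta.use_case}' "
        f"that requires the skills: {', '.join(meta.skills)}."
    )
    for _ in range(rubric_rounds):
        # Self-Rubrics: the strong LLM proposes a tailored action to make the
        # instruction more challenging, then rewrites it accordingly.
        rubric = strong_llm(
            f"Suggest one concrete action to make this instruction harder:\n{instruction}"
        )
        instruction = strong_llm(
            f"Rewrite the instruction following this action.\n"
            f"Action: {rubric}\nInstruction: {instruction}"
        )
    return instruction


def contrastive_filter(
    instruction: str,
    strong_llm: Callable[[str], str],
    target_llm: Callable[[str], str],
    judge: Callable[[str, str], float],
    gap_threshold: float = 2.0,
) -> Optional[dict]:
    """Contrastive Filtering: keep a pair only when the strong LLM clearly
    outperforms the target LLM, i.e. the sample still teaches the target model."""
    strong_resp = strong_llm(instruction)
    target_resp = target_llm(instruction)
    gap = judge(instruction, strong_resp) - judge(instruction, target_resp)
    if gap >= gap_threshold:
        return {"instruction": instruction, "response": strong_resp}
    return None  # the target model already handles this instruction well
```

In a full pipeline of this shape, `encode` would run over the seed set to build a metadata pool capturing the target distribution, `decode` would sample from that pool to synthesize tailored instructions, and `contrastive_filter` would retain only the pairs where the strong model's response reveals a capability gap worth fine-tuning on.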