Understanding the Effect of Model Compression on Social Bias in Large Language Models (2312.05662v2)
Abstract: Large Language Models (LLMs) trained with self-supervision on vast corpora of web text fit to the social biases of that text. Without intervention, these social biases persist in the model's predictions in downstream tasks, leading to representational harm. Many strategies have been proposed to mitigate the effects of inappropriate social biases learned during pretraining. Simultaneously, methods for model compression have become increasingly popular as a way to reduce the computational burden of LLMs. Despite the popularity of and need for both approaches, little work has been done to explore the interplay between the two. We perform a carefully controlled study of the impact of model compression via quantization and knowledge distillation on measures of social bias in LLMs. We find that longer pretraining and larger models lead to higher social bias, and that quantization has a regularizing effect, with its best trade-off at around 20% of the original pretraining time.
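To make the kind of evaluation described above concrete, here is a minimal sketch (not the authors' exact pipeline): it applies PyTorch post-training dynamic quantization to a masked language model and compares pseudo-log-likelihoods of a stereotyped versus anti-stereotyped sentence pair, in the spirit of CrowS-Pairs-style bias scoring. The model name and the sentence pair are illustrative placeholders, not the checkpoints or benchmark items used in the paper.

```python
# Sketch: compare a bias-style score before and after int8 dynamic quantization.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

name = "bert-base-uncased"  # assumed model; swap in the checkpoint under study
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name).eval()

# Post-training dynamic quantization of the Linear layers to int8 (CPU).
q_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)


def pseudo_log_likelihood(mlm, sentence: str) -> float:
    """Sum of log-probabilities of each token when it alone is masked."""
    ids = tok(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tok.mask_token_id
        with torch.no_grad():
            logits = mlm(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total


pair = (
    "The nurse said she would help.",  # hypothetical stereotyped sentence
    "The nurse said he would help.",   # hypothetical anti-stereotyped sentence
)

for label, mlm in [("fp32", model), ("int8", q_model)]:
    s, a = (pseudo_log_likelihood(mlm, x) for x in pair)
    # A model that systematically prefers the stereotyped variant scores > 0 here;
    # comparing fp32 vs. int8 shows how compression shifts that preference.
    print(f"{label}: stereotype preference = {s - a:.3f}")
```

Aggregating this difference over a full benchmark of sentence pairs, rather than a single pair, is what a bias metric of this kind would actually report.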