Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels (2309.09697v3)
Abstract: Discriminatory gender biases have been found in Pre-trained Language Models (PLMs) for multiple languages. In Natural Language Inference (NLI), existing bias evaluation methods have focused on the prediction results of one specific label out of three, such as neutral. However, such evaluation methods can be inaccurate because distinct biased inferences are associated with distinct prediction labels. To address this limitation, we propose a bias evaluation method for PLMs, called NLI-CoAL, which considers all three labels of the NLI task. First, we create three evaluation data groups that represent different types of biases. Then, we define a bias measure based on the corresponding label output of each data group. In the experiments, we introduce a meta-evaluation technique for NLI bias measures and use it to confirm that our bias measure distinguishes biased, incorrect inferences from non-biased incorrect inferences better than the baseline, resulting in a more accurate bias evaluation. We create the datasets in English, Japanese, and Chinese, and validate the compatibility of our bias measure across multiple languages. Lastly, we observe the bias tendencies in PLMs of different languages. To our knowledge, we are the first to construct evaluation datasets and measure PLMs' bias from NLI in Japanese and Chinese.
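The abstract does not spell out NLI-CoAL's formula, so the following is only a minimal illustrative sketch of the general idea: score each evaluation data group by how often the model outputs that group's bias-indicating label, then aggregate across groups. The group names, the per-group rate, and the averaging step are all assumptions for illustration, not the paper's actual definition.

```python
from collections import Counter

# The three standard NLI labels.
LABELS = ("entailment", "neutral", "contradiction")

def group_bias_rate(predictions, bias_label):
    """Fraction of predictions in one data group that match the
    bias-indicating label for that group (assumed aggregation)."""
    counts = Counter(predictions)
    return counts[bias_label] / len(predictions)

def label_aware_bias_score(groups):
    """Average the per-group rates. `groups` maps a (hypothetical)
    group name to (predicted_labels, bias_indicating_label)."""
    rates = [group_bias_rate(preds, label) for preds, label in groups.values()]
    return sum(rates) / len(rates)

# Toy usage with made-up predictions and group names.
groups = {
    "pro-stereotypical":  (["entailment", "entailment", "neutral"], "entailment"),
    "anti-stereotypical": (["contradiction", "neutral", "neutral"], "contradiction"),
    "counter-factual":    (["neutral", "neutral", "entailment"], "neutral"),
}
print(round(label_aware_bias_score(groups), 3))
```

The point of considering all three labels, as the abstract argues, is that each group's biased behavior surfaces as a different label, so a measure that watches only one label (e.g. neutral) conflates biased errors with ordinary mistakes.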