Undesirable Biases in NLP: Addressing Challenges of Measurement (2211.13709v4)
Abstract: As large language models (LLMs) and NLP technology rapidly develop and spread into daily life, it becomes crucial to anticipate how their use could harm people. One problem that has received much attention in recent years is that these technologies display harmful biases, from generating derogatory stereotypes to producing disparate outcomes for different social groups. Although considerable effort has been invested in assessing and mitigating these biases, existing methods for measuring the bias of NLP models have serious shortcomings, and it is often unclear what they actually measure. In this paper, we take an interdisciplinary approach to the issue of NLP model bias, adopting the lens of psychometrics -- a field that specializes in measuring concepts, like bias, that are not directly observable. In particular, we explore two central notions from psychometrics, the construct validity and the reliability of measurement tools, and discuss how they can be applied to the measurement of model bias. Our goal is to provide NLP practitioners with methodological tools for designing better bias measures, and more generally to inspire them to draw on psychometrics when building bias measurement tools.
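To make the psychometric angle concrete, here is a minimal sketch (illustrative only, not the paper's method) of how one classical reliability estimate -- split-half reliability with the Spearman-Brown correction -- could be applied to a simple word-embedding bias measure. The target and attribute word lists are hypothetical, and random vectors stand in for trained embeddings; a real analysis would substitute an actual embedding model and validated stimuli.

```python
# Minimal sketch: split-half reliability of a simple embedding-based
# gender-bias score. Random vectors are PLACEHOLDERS for real embeddings,
# and the word lists below are hypothetical, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
DIM = 50

targets = ["doctor", "nurse", "engineer", "teacher",
           "lawyer", "librarian", "scientist", "receptionist"]
male_attrs = ["he", "him", "his", "man", "male", "boy"]
female_attrs = ["she", "her", "hers", "woman", "female", "girl"]

# Placeholder embedding table; real use would load trained vectors here.
emb = {w: rng.normal(size=DIM) for w in targets + male_attrs + female_attrs}

def cos(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def bias_scores(attr_m, attr_f):
    """Per-target bias: mean cosine to male attrs minus female attrs."""
    return np.array([
        np.mean([cos(emb[t], emb[a]) for a in attr_m])
        - np.mean([cos(emb[t], emb[a]) for a in attr_f])
        for t in targets
    ])

# Split-half reliability: score the targets twice, once with each half
# of the attribute lists, correlate the two score vectors, and apply
# the Spearman-Brown correction for the halved test length.
half = len(male_attrs) // 2
s1 = bias_scores(male_attrs[:half], female_attrs[:half])
s2 = bias_scores(male_attrs[half:], female_attrs[half:])
r = np.corrcoef(s1, s2)[0, 1]
reliability = 2 * r / (1 + r)  # Spearman-Brown corrected estimate
print(f"split-half r = {r:.3f}, corrected reliability = {reliability:.3f}")
```

With random placeholder vectors the corrected reliability will sit near zero, as expected. For a bias measure applied to real embeddings, a similarly low value would signal that the score reflects noise in the chosen word lists rather than a stable property of the model -- exactly the kind of measurement check the paper argues bias measures should undergo.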
Authors: Oskar van der Wal, Dominik Bachmann, Alina Leidinger, Leendert van Maanen, Willem Zuidema, Katrin Schulz