CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models

Published 30 Sep 2020 in cs.CL and cs.AI | (2010.00133v1)

Abstract: Pretrained language models, especially masked language models (MLMs), have seen success across many NLP tasks. However, there is ample evidence that they use the cultural biases that are undoubtedly present in the corpora they are trained on, implicitly creating harm with biased representations. To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs). CrowS-Pairs has 1508 examples that cover stereotypes dealing with nine types of bias, like race, religion, and age. In CrowS-Pairs a model is presented with two sentences: one that is more stereotyping and another that is less stereotyping. The data focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. We find that all three of the widely-used MLMs we evaluate substantially favor sentences that express stereotypes in every category in CrowS-Pairs. As work on building less biased models advances, this dataset can be used as a benchmark to evaluate progress.





Summary

The paper presents a novel dataset for assessing bias in masked language models through contrasting sentence pairs.
It evaluates models like BERT, RoBERTa, and ALBERT using pseudo-log-likelihood scores, revealing significant bias across nine categories.
The findings suggest that more capable models tend to show more bias, which motivates work on improved debiasing techniques.

The paper "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models" presents a dataset designed to quantify social biases in widely used masked language models (MLMs). Authored by Nikita Nangia, Clara Vania, Rasika Bhalerao, and Samuel R. Bowman, the paper describes a method for evaluating how strongly stereotypes are encoded in language models trained on real-world text. MLMs underpin many recent advances in NLP, but they also inherit cultural biases from their training corpora. CrowS-Pairs addresses this problem by offering a resource for measuring bias, particularly against historically disadvantaged groups in the United States.

Methodology and Construction

CrowS-Pairs contains 1508 sentence pairs, each contrasting a more stereotyping sentence with a minimally different sentence that is less stereotyping. The dataset spans nine bias categories: race, gender, sexual orientation, religion, age, nationality, disability, physical appearance, and socioeconomic status. Because CrowS-Pairs is crowdsourced, it covers a wider range of stereotypes and sentence structures than template-based datasets.

The dataset is designed to test whether MLMs systematically prefer the sentence that expresses a stereotype. Each pair sets a sentence about a historically disadvantaged group against one about an advantaged group, and the two sentences differ only in the words that identify the group. Because everything else is held fixed, a difference in how a model scores the two sentences can largely be attributed to the group mention.
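The minimal-pair construction can be illustrated with a short sketch that splits two sentences into the tokens they share and the tokens that differ. This is an illustration only, not the authors' tooling: the function name `split_pair`, the whitespace tokenization, and the placeholder sentences are assumptions made for the example.

```python
from difflib import SequenceMatcher


def split_pair(sent_a, sent_b):
    """Split a sentence pair into shared tokens and tokens unique to each side.

    Hypothetical helper: tokenizes on whitespace, which a real masked
    language model tokenizer would not do.
    """
    a, b = sent_a.split(), sent_b.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    shared, only_a, only_b = [], [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Tokens that appear in both sentences in the same order.
            shared.extend(a[i1:i2])
        else:
            # Tokens that were replaced, inserted, or deleted.
            only_a.extend(a[i1:i2])
            only_b.extend(b[j1:j2])
    return shared, only_a, only_b


# Placeholder pair (invented, not taken from the dataset).
shared, only_a, only_b = split_pair(
    "The GROUP_A applicant was late",
    "The GROUP_B applicant was late",
)
```

In a well-formed pair, `only_a` and `only_b` should contain just the group-identifying words, and `shared` should hold the rest of the sentence.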

Evaluation of Models

The study evaluates three widely used MLMs: BERT, RoBERTa, and ALBERT. Using pseudo-log-likelihood scoring, the authors test how often each model assigns a higher likelihood to the more stereotyping sentence of a pair. All three models favor the stereotyping sentences in every one of the nine categories, which underscores the need for care when deploying them.
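As a rough illustration of this kind of evaluation, the sketch below scores each sentence with a generic pseudo-log-likelihood (mask each token in turn and sum the log-probabilities) and reports the percentage of pairs in which the more stereotyping sentence scores higher. It is not the authors' exact metric or code: the `token_log_prob` callable, the toy probability table, and the placeholder pairs are all assumptions standing in for a real masked language model and the real dataset.

```python
import math


def pseudo_log_likelihood(tokens, token_log_prob):
    """Sum of log P(token_i | all other tokens), masking one token at a time.

    token_log_prob(tokens, i) is a hypothetical interface to a masked LM.
    """
    return sum(token_log_prob(tokens, i) for i in range(len(tokens)))


def stereotype_preference_rate(pairs, token_log_prob):
    """Percentage of (more, less) pairs where the more stereotyping sentence scores higher."""
    preferred = 0
    for more, less in pairs:
        score_more = pseudo_log_likelihood(more.split(), token_log_prob)
        score_less = pseudo_log_likelihood(less.split(), token_log_prob)
        if score_more > score_less:
            preferred += 1
    return 100.0 * preferred / len(pairs)


# Toy stand-in for a masked LM: a fixed, invented unigram table.
TOY_PROBS = {"GROUP_A": 0.2, "GROUP_B": 0.1}


def toy_token_log_prob(tokens, i):
    # Ignores context entirely; a real model would condition on the other tokens.
    return math.log(TOY_PROBS.get(tokens[i], 0.5))


# Placeholder pairs: (more stereotyping, less stereotyping), invented for the example.
TOY_PAIRS = [
    ("GROUP_A people are late", "GROUP_B people are late"),
    ("GROUP_B people are loud", "GROUP_A people are loud"),
]

rate = stereotype_preference_rate(TOY_PAIRS, toy_token_log_prob)
```

Under this framing, a model with no systematic preference for either sentence would land near 50%, and values well above that indicate a preference for the stereotyping sentences.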

Moreover, an analysis of model confidence indicates that larger, more performant models like ALBERT not only show greater bias but do so with higher confidence. This points to a possible trade-off between model capability and bias, with implications for model design and training procedures.

Implications and Future Directions

CrowS-Pairs serves as a diagnostic tool within the broader effort to develop more socially aware, less biased LLMs. The dataset's insights can guide the enhancement of debiasing techniques and underpin the evaluation of new models aimed at reducing harmful stereotype propagation.

The authors acknowledge the inherent complexity of social biases and recommend that future work extend beyond CrowS-Pairs. The dataset is a step toward quantifying bias in language models and should be viewed as one part of a larger effort rather than a complete solution.

Continued exploration is necessary to develop robust evaluation metrics for autoregressive models and to devise debiasing methodologies that maintain downstream task performance.

In conclusion, CrowS-Pairs offers a benchmark for evaluating social biases in MLMs, supporting a more nuanced understanding of bias in NLP models and the development of more equitable technology. The dataset characterizes stereotypes that widely used MLMs favor and raises questions for future AI research and ethical consideration.
