Abstract

Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation (OLG). This social knowledge serves as a guide for evaluating popular LLMs on two key aspects: (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community. We discover a dominance of binary gender norms reflected by the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, TGNB disclosure generated the most stigmatizing language and scored most toxic, on average. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.

Overview

  • The paper investigates biases in language generation models against TGNB individuals and introduces the TANGO dataset.

  • TANGO contains text instances from the Nonbinary Wiki, enabling assessment of misgendering and harmful language in AI responses.

  • Findings show prevalent misgendering by LLMs, especially with lesser-known neopronouns, and harmful responses to gender identity disclosures.

  • The study reveals a bias toward traditional gender norms and difficulty handling neopronouns, indicating the need for more inclusive AI.

  • Suggested improvements include pretraining on more diverse corpora, refined tokenizers, and in-context learning to better represent TGNB identities in AI.

The paper "Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation" presents an investigation into how language generation technologies, such as LLMs, may inadvertently marginalize or discriminate against Transgender and Non-Binary (TGNB) individuals through biases in generated text. The study provides a comprehensive evaluation framework for measuring such biases and introduces a dataset named TANGO, purpose-built to assess misgendering and harmful language in response to gender disclosures.

The authors gather data from the Nonbinary Wiki to construct TANGO, which consists of real-world text instances and templates related to TGNB experiences. The dataset includes one set of prompts for examining pronoun consistency (to detect misgendering) and another for measuring potentially harmful responses to disclosures of gender identity. Results indicate widespread misgendering by LLMs, especially when prompts include lesser-known TGNB-specific pronouns (neopronouns), and reveal that LLMs are prone to generating harmful responses to gender disclosures, particularly for non-binary and gender-fluid identities.
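
To make the evaluation setup concrete, here is a minimal sketch of template-based misgendering prompts and a crude pronoun-consistency check. This is not the authors' released code: the template string, pronoun sets, and the commented-out `generate` call are illustrative assumptions, and the actual TANGO prompts are curated from the Nonbinary Wiki rather than hand-written like this.

```python
# Illustrative sketch only: templates, pronoun sets, and generate() are
# assumptions, not the released TANGO dataset or evaluation code.

PRONOUN_SETS = {
    "binary_he":     {"nom": "he",   "acc": "him",  "pos": "his"},
    "binary_she":    {"nom": "she",  "acc": "her",  "pos": "her"},
    "singular_they": {"nom": "they", "acc": "them", "pos": "their"},
    "neo_xe":        {"nom": "xe",   "acc": "xem",  "pos": "xyr"},
    "neo_ey":        {"nom": "ey",   "acc": "em",   "pos": "eir"},
}

TEMPLATE = "{name} is a writer, and {nom} is working on a new book."

def build_prompt(name: str, pronouns: dict) -> str:
    """Fill the template with the antecedent's name and nominative pronoun."""
    return TEMPLATE.format(name=name, nom=pronouns["nom"])

def misgenders(generated: str, pronouns: dict) -> bool:
    """Crude consistency check: flag any third-person pronoun in the
    continuation that is not in the antecedent's declared pronoun set."""
    declared = set(pronouns.values())
    known_pronouns = {
        "he", "him", "his", "she", "her", "hers",
        "they", "them", "their", "xe", "xem", "xyr", "ey", "em", "eir",
    }
    tokens = {t.strip(".,!?\"'").lower() for t in generated.split()}
    return bool((tokens & known_pronouns) - declared)

# Usage with a hypothetical generate() completion call:
# prompt = build_prompt("Casey", PRONOUN_SETS["neo_xe"])
# continuation = generate(prompt)
# print(misgenders(continuation, PRONOUN_SETS["neo_xe"]))
```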

The paper finds that generated texts are less harmful when binary gender pronouns are used, revealing a bias toward traditional gender norms. Additionally, language models struggle with the grammatical rules for neopronouns, hinting at broader issues of pronoun recognition and representation in AI systems. A case study with ChatGPT demonstrates the need for further research and development of more inclusive language technologies.
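
As an illustration of what "grammatical rules for neopronouns" means in practice, here is a rough sketch of a case-agreement check built on spaCy. The xe/xem/xyr declension, the dependency-label heuristics, and the example sentence are assumptions for illustration, and the parser itself may mishandle out-of-vocabulary pronouns, which is part of the representational gap being described.

```python
# Illustrative sketch, not the paper's evaluation: a crude check of whether a
# continuation uses the grammatically expected case form of one example
# neopronoun family (xe/xem/xyr).
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

XE_FAMILY = {"nominative": "xe", "accusative": "xem", "possessive": "xyr"}

def case_errors(text: str) -> list[str]:
    """Flag neopronoun tokens whose syntactic role clashes with their form,
    e.g. nominative 'xe' appearing as the object of a verb or preposition."""
    errors = []
    for tok in nlp(text):
        low = tok.text.lower()
        if low not in XE_FAMILY.values():
            continue
        if tok.dep_ in {"dobj", "pobj", "iobj", "dative"} and low == "xe":
            errors.append(f"'{tok.text}' used as {tok.dep_} where 'xem' is expected")
        if tok.dep_ in {"nsubj", "nsubjpass"} and low == "xem":
            errors.append(f"'{tok.text}' used as {tok.dep_} where 'xe' is expected")
    return errors

# Expected to flag 'xe' as an object and 'xem' as a subject, provided the
# parser assigns those roles correctly to the unfamiliar tokens:
# print(case_errors("I met xe at the library, and xem gave me xyr book."))
```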

The research warns against the erasure of TGNB identities and suggests avenues for future improvement, such as pretraining with more diverse corpora, refining tokenizers to preserve the structural integrity of TGNB pronouns, and employing in-context learning techniques with various TGNB examples. The authors call for centering marginalized voices in AI and recommend increased scrutiny of the normative assumptions behind toxicity annotation and language model development.
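
The tokenizer and in-context learning suggestions can be illustrated with a short sketch. It assumes the Hugging Face transformers package and the GPT-2 tokenizer; the exact subword splits vary by tokenizer, and the few-shot exemplars are hypothetical.

```python
# Illustrative sketch: inspecting how a subword tokenizer fragments pronouns.
# Assumes the `transformers` package; splits shown may vary by tokenizer.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

for pronoun in [" he", " she", " they", " xe", " xem", " xyr", " ey", " eir"]:
    print(f"{pronoun!r:8} -> {tok.tokenize(pronoun)}")
# Binary pronouns and singular they typically map to a single token, while
# neopronouns are often split into several subwords, one of the representation
# gaps a refined tokenizer could close.

# A minimal in-context learning prompt, with hypothetical exemplars, along the
# lines of exposing models to correct TGNB pronoun usage before generation:
few_shot_prompt = (
    "Casey uses xe/xem/xyr pronouns. Xe is a chef, and xyr soup is famous.\n"
    "Robin uses ey/em/eir pronouns. Ey paints murals, and eir studio is downtown.\n"
    "Alex uses xe/xem/xyr pronouns. Xe"
)
```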
