Uncovering Name-Based Biases in Large Language Models Through Simulated Trust Game (2404.14682v1)

Published 23 Apr 2024 in cs.CY

Abstract: Gender and race inferred from an individual's name are a notable source of stereotypes and biases that subtly influence social interactions. Abundant evidence from human experiments has revealed the preferential treatment that one receives when one's name suggests a predominant gender or race. As LLMs acquire more capabilities and begin to support everyday applications, it becomes crucial to examine whether they manifest similar biases when encountering names in a complex social interaction. In contrast to previous work that studies name-based biases in LLMs at a more fundamental level, such as word representations, we challenge three prominent models to predict the outcome of a modified Trust Game, a well-publicized paradigm for studying trust and reciprocity. To ensure the internal validity of our experiments, we have carefully curated a list of racially representative surnames to identify players in a Trust Game and rigorously verified the construct validity of our prompts. The results of our experiments show that our approach can detect name-based biases in both base and instruction-tuned models.

Authors (3)

Yumou Wei
Paulo F. Carvalho
John Stamper
