Gender-preserving Debiasing for Pre-trained Word Embeddings (1906.00742v1)

Published 3 Jun 2019 in cs.CL and cs.LG

Abstract: Word embeddings learnt from massive text collections have demonstrated significant levels of discriminative biases such as gender, racial or ethnic biases, which in turn bias the down-stream NLP applications that use those word embeddings. Taking gender-bias as a working example, we propose a debiasing method that preserves non-discriminative gender-related information, while removing stereotypical discriminative gender biases from pre-trained word embeddings. Specifically, we consider four types of information: feminine, masculine, gender-neutral and stereotypical, which represent the relationship between gender vs. bias, and propose a debiasing method that (a) preserves the gender-related information in feminine and masculine words, (b) preserves the neutrality in gender-neutral words, and (c) removes the biases from stereotypical words. Experimental results on several previously proposed benchmark datasets show that our proposed method can debias pre-trained word embeddings better than existing SoTA methods proposed for debiasing word embeddings while preserving gender-related but non-discriminative information.
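The three goals (a) to (c) can be illustrated with a toy sketch. This is not the paper's method, whose objective the abstract does not spell out. It swaps in a simple projection-removal step along an assumed gender direction. The helper names (`dot`, `unit`, `remove_component`, `debias`), the 2-d vectors, and the word-to-category labels are all hypothetical and chosen only for illustration.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    """Scale v to unit length."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def remove_component(v, g):
    """Subtract from v its projection onto the unit direction g."""
    p = dot(v, g)
    return [a - p * b for a, b in zip(v, g)]

def debias(embeddings, categories, gender_direction):
    """Toy debiasing: leave feminine/masculine words untouched (goal a),
    and zero out the gender component of gender-neutral and
    stereotypical words (goals b and c)."""
    g = unit(gender_direction)
    out = {}
    for word, vec in embeddings.items():
        if categories[word] in ("feminine", "masculine"):
            out[word] = list(vec)
        else:
            out[word] = remove_component(vec, g)
    return out

# Made-up 2-d embeddings; the first axis plays the role of "gender".
embeddings = {
    "she": [1.0, 0.2],
    "he": [-1.0, 0.2],
    "table": [0.0, 1.0],
    "nurse": [0.6, 0.8],
}
categories = {
    "she": "feminine",
    "he": "masculine",
    "table": "gender-neutral",
    "nurse": "stereotypical",
}
# Assumed gender direction: difference of one feminine/masculine pair.
g = [a - b for a, b in zip(embeddings["she"], embeddings["he"])]
debiased = debias(embeddings, categories, g)
```

In this sketch the gendered words keep their vectors unchanged, while the stereotypical word loses its component along the assumed gender direction. A learned approach like the one the abstract describes would presumably trade these goals off jointly rather than apply a hard rule per category.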

Citations (128)


