A survey on datasets for fairness-aware machine learning

Published 1 Oct 2021 in cs.LG | (2110.00530v3)

Abstract: As decision-making increasingly relies on Machine Learning (ML) and (big) data, the issue of fairness in data-driven AI systems is receiving increasing attention from both research and industry. A large variety of fairness-aware machine learning solutions have been proposed which involve fairness-related interventions in the data, learning algorithms and/or model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we overview real-world datasets used for fairness-aware machine learning. We focus on tabular data as the most common data representation for fairness-aware machine learning. We start our analysis by identifying relationships between the different attributes, particularly w.r.t. protected attributes and class attribute, using a Bayesian network. For a deeper understanding of bias in the datasets, we investigate the interesting relationships using exploratory analysis.




Summary

The paper provides a comprehensive survey of real-world tabular datasets used to evaluate fairness in ML, highlighting biases in finance, criminology, healthcare, and education.
The paper employs Bayesian networks to uncover relationships between attributes, and logistic regression to measure bias using statistical parity, equalized odds, and ABROCA.
The paper benchmarks existing datasets and calls for more diverse, contemporary, and temporally rich data to advance fairness research in AI.

A Survey on Datasets for Fairness-aware Machine Learning

The paper "A Survey on Datasets for Fairness-aware Machine Learning" offers a comprehensive examination of datasets utilized in the empirical evaluation of fairness-aware machine learning models. As the reliance on ML for decision-making intensifies across various sectors, addressing fairness in AI systems becomes crucial to mitigate discrimination based on protected attributes. The paper primarily focuses on real-world datasets, particularly those represented in tabular form, serving as a foundation for developing and testing fairness-aware machine learning solutions.

Key Aspects and Dataset Analysis

The authors underscore the importance of datasets in fairness-aware ML, elaborating on their utility in evaluating the bias of ML models. The survey categorizes datasets by application domains, namely finance, criminology, healthcare, and education, each embodying distinct fairness challenges and operational characteristics.

Financial Datasets: These datasets, such as Adult and German Credit, are dominated by demographic features including age, sex, and race. They highlight the potential for bias in income prediction and credit scoring, with considerable imbalance across protected groups.
Criminological Datasets: The COMPAS datasets feature prominently and expose racial bias in recidivism prediction. The analysis shows discriminatory patterns in the data itself, which motivates fairness-oriented modeling.
Healthcare and Social Datasets: The Diabetes and Ricci datasets raise bias concerns related to race and gender in two sensitive applications: hospital readmission prediction (Diabetes) and employment promotion decisions (Ricci).
Educational Datasets: Exemplified by the Student Performance and OULAD datasets, these datasets depict biases in academic outcomes based on gender and socio-economic indicators, calling for fairness in educational recommendation systems.
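The imbalance across protected groups mentioned in the list above can be checked with a simple per-group base rate. The following is a minimal sketch, not the survey's own code: the function name `positive_rate_by_group`, the column names, and the toy rows are all hypothetical and do not come from any of the surveyed datasets.

```python
from collections import Counter

def positive_rate_by_group(rows, protected_key, label_key, positive_label):
    """Return the share of positive-class rows within each protected group."""
    totals = Counter()
    positives = Counter()
    for row in rows:
        group = row[protected_key]
        totals[group] += 1
        if row[label_key] == positive_label:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

# Made-up toy rows for illustration only (NOT real Adult data).
toy_rows = [
    {"sex": "Female", "income": "<=50K"},
    {"sex": "Female", "income": "<=50K"},
    {"sex": "Female", "income": ">50K"},
    {"sex": "Female", "income": "<=50K"},
    {"sex": "Male", "income": ">50K"},
    {"sex": "Male", "income": "<=50K"},
]

rates = positive_rate_by_group(toy_rows, "sex", "income", ">50K")
print(rates)
```

A large gap between the per-group rates suggests that the class label is distributed unevenly across the protected attribute before any model is trained.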

The methodology uses Bayesian networks to uncover conditional dependencies in the datasets, identifying direct and indirect relationships between attributes, including the protected attributes and the class attribute. This exploration helps diagnose the roots of bias and informs fairness-aware ML approaches.
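As a worked illustration of direct versus indirect dependencies, consider the standard factorization of a Bayesian network and a hypothetical three-node chain. The symbols S (protected attribute), X (an intermediate attribute), and Y (class attribute) are assumptions chosen for this example, not a structure reported in the survey.

```latex
% General factorization: each attribute depends only on its parents Pa(X_i).
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

% Hypothetical chain S -> X -> Y: there is no direct edge from S to Y.
P(S, X, Y) = P(S)\, P(X \mid S)\, P(Y \mid X)

% Y is still indirectly dependent on S through X:
P(Y \mid S) = \sum_{x} P(Y \mid X = x)\, P(X = x \mid S)
```

In this chain, Y is conditionally independent of S given X, yet the marginal P(Y | S) generally varies with S. This is the kind of indirect dependency that removing the protected attribute alone does not eliminate.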

Experimental Evaluation

The paper conducts a preliminary experimental evaluation, training logistic regression on each dataset and assessing it with common fairness measures: statistical parity, equalized odds, and ABROCA (Absolute Between-ROC Area). Reporting these measures alongside predictive performance shows how accuracy and fairness relate on each dataset, and reveals substantial disparities across datasets.

Implications and Future Directions

The implications of this research are manifold:

Benchmarks for Fairness Research: By cataloging these datasets, the paper creates a benchmark repository facilitating comparative analyses in fairness-aware ML.
Call for Comprehensive Datasets: The discussion extends to identifying gaps in existing datasets, urging the development of diverse datasets encapsulating various fairness scenarios across different domains and temporal contexts.
Integration of New Datasets: Recently introduced datasets like the Adult Reconstruction and ACS PUMS are highlighted as potential pathways for incorporating spatial and temporal diversity into fairness studies.

Conclusion

The survey concludes with a call for the ML community to prioritize creating and using diverse, contemporary datasets that represent multiple perspectives and facets of fairness. It also suggests examining synthetic and sequential datasets to capture fairness dynamics over time, broadening the research landscape in fairness-aware machine learning. In doing so, the survey provides groundwork for model development that takes fairness into account.
