Differential Privacy Has Disparate Impact on Model Accuracy

Published 28 May 2019 in cs.LG, cs.CR, and stat.ML (arXiv:1905.12101v2)

Abstract: Differential privacy (DP) is a popular mechanism for training machine learning models with bounded leakage about the presence of specific points in the training data. The cost of differential privacy is a reduction in the model's accuracy. We demonstrate that in the neural networks trained using differentially private stochastic gradient descent (DP-SGD), this cost is not borne equally: accuracy of DP models drops much more for the underrepresented classes and subgroups. For example, a gender classification model trained using DP-SGD exhibits much lower accuracy for black faces than for white faces. Critically, this gap is bigger in the DP model than in the non-DP model, i.e., if the original model is unfair, the unfairness becomes worse once DP is applied. We demonstrate this effect for a variety of tasks and models, including sentiment analysis of text and image classification. We then explain why DP training mechanisms such as gradient clipping and noise addition have disproportionate effect on the underrepresented and more complex subgroups, resulting in a disparate reduction of model accuracy.

Summary

The paper demonstrates that DP-SGD significantly reduces model accuracy for underrepresented subgroups, thereby exacerbating fairness issues in machine learning models.
The study identifies gradient clipping and noise addition as primary factors that disrupt learning processes for underrepresented classes, leading to pronounced accuracy disparities.
The research underscores the urgent need for algorithmic innovations that balance differential privacy with fairness to improve performance on imbalanced datasets.

Differential Privacy and the Disparate Impact on Model Accuracy

The paper "Differential Privacy Has Disparate Impact on Model Accuracy," by Eugene Bagdasaryan and Vitaly Shmatikov, examines how differential privacy (DP) affects the accuracy of machine learning models across subgroups. It investigates how differentially private stochastic gradient descent (DP-SGD) widens accuracy gaps in models trained on imbalanced datasets.

Differential privacy has become a standard tool in privacy-preserving machine learning because it bounds how much a trained model can leak about the presence of any individual point in the training data. The paper identifies a side effect: the accuracy cost of DP is not spread evenly across classes or subgroups. It falls disproportionately on underrepresented and more complex subgroups, which widens accuracy gaps and makes the model less fair. The authors demonstrate this empirically on several tasks, including gender classification on facial images, sentiment analysis of tweets, and species classification on the iNaturalist dataset.
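For readers who want the bound stated precisely, the standard (ε, δ) definition of differential privacy is given below. The summary itself does not spell it out, so the symbols here (the mechanism M, the neighboring datasets D and D', and the output set S) are conventional notation introduced for illustration rather than taken from the text above.

```latex
% A randomized mechanism M is (\varepsilon, \delta)-differentially private if,
% for every pair of datasets D, D' that differ in a single record
% and every set S of possible outputs:
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,] + \delta
```

Smaller values of ε and δ mean that the output distribution changes less when one record is added or removed, which is the sense in which leakage about any single training point is bounded.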

Core Findings and Contribution

The authors show that DP-SGD amplifies existing model biases when the training data is imbalanced. For example, a DP model for gender classification was much less accurate on black faces than on white faces, and this gap was larger than in the non-DP model. Similarly, in sentiment analysis and species classification, accuracy dropped more on subsets with complex or rare data.
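The comparison described above comes down to measuring accuracy per subgroup and checking whether the gap grows under DP training. The sketch below shows one way to compute that. The helper names `subgroup_accuracy` and `accuracy_gap` are hypothetical, and the labels and predictions are toy values made up for illustration, not results from the paper.

```python
import numpy as np

def subgroup_accuracy(y_true, y_pred, groups):
    """Return a dict mapping each subgroup label to its accuracy."""
    y_true, y_pred, groups = (np.asarray(a) for a in (y_true, y_pred, groups))
    correct = y_true == y_pred
    return {g: float(correct[groups == g].mean())
            for g in np.unique(groups).tolist()}

def accuracy_gap(acc_by_group):
    """Difference between the best and worst subgroup accuracy."""
    values = list(acc_by_group.values())
    return max(values) - min(values)

# Toy data (hypothetical): six majority-group and four minority-group examples.
groups = ["majority"] * 6 + ["minority"] * 4
y_true = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
pred_plain = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1]  # stand-in for a non-DP model
pred_dp = [1, 0, 1, 0, 1, 1, 1, 1, 0, 1]     # stand-in for a DP model

acc_plain = subgroup_accuracy(y_true, pred_plain, groups)
acc_dp = subgroup_accuracy(y_true, pred_dp, groups)
gap_plain = accuracy_gap(acc_plain)
gap_dp = accuracy_gap(acc_dp)
print("non-DP:", acc_plain, "gap:", gap_plain)
print("DP:    ", acc_dp, "gap:", gap_dp)
```

Reporting the gap alongside overall accuracy matters because a model can lose only a little accuracy on average while losing much more on the smaller group.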

The paper identifies two steps of DP-SGD, gradient clipping and noise addition, as the main sources of these disparities. Clipping bounds the norm of each example's gradient, and noise is then added to the aggregated gradient, so the larger gradients that underrepresented or harder examples tend to produce are scaled down the most and their signal is more easily masked. The model therefore learns less from these classes. The authors tested this across models, including ResNet18 for gender classification on facial images and LSTMs for language tasks, and found a consistent pattern of larger accuracy loss in smaller or less frequent classes.
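To make the two mechanisms concrete, here is a minimal NumPy sketch of how a DP-SGD gradient estimate is commonly formed: clip each per-example gradient to a fixed norm, sum, add Gaussian noise scaled to that norm, and average. The function name `dp_sgd_step`, its parameters, and the toy gradients are assumptions made for illustration, and this is not the authors' implementation.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, and average.

    Returns the noisy averaged gradient and the per-example scaling factors.
    """
    grads = np.asarray(per_example_grads, dtype=float)  # shape (batch, dim)
    norms = np.linalg.norm(grads, axis=1)
    # Factor is 1 for gradients within the bound, clip_norm / norm otherwise.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale[:, None]
    # Noise standard deviation is proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / len(grads), scale

# Toy batch (hypothetical): nine well-fit examples with small gradients
# and one rare example with a large gradient.
rng = np.random.default_rng(0)
majority = np.tile([0.5, 0.0], (9, 1))  # gradient norm 0.5 each
minority = np.array([[0.0, 5.0]])       # gradient norm 5.0
batch = np.vstack([majority, minority])

update, scale = dp_sgd_step(batch, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
print("scaling factors:", scale)
print("noisy update:", update)
```

In this toy batch the nine small gradients pass through unchanged, while the single large gradient is shrunk to one fifth of its size before noise is added. That asymmetry is the effect the paragraph above describes.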

Implications and Future Directions

The paper raises important considerations for the development of fair and privacy-preserving deep learning models. The findings suggest that standard DP-SGD training can worsen existing biases, which makes equitable model performance harder to achieve. New methods are therefore needed that address privacy and fairness together.

Future research could explore algorithms that reconcile differential privacy with fairness objectives. Adaptive mechanisms that balance gradient clipping and noise addition more effectively could reduce the disparate impact observed. Studying how DP interacts with different fairness metrics across more datasets and model architectures would also clarify when and why the disparity arises.

Combining privacy and fairness becomes more important as machine learning is deployed more widely. Building models that are robust, unbiased, and privacy-respecting will require sustained effort from the research community, and may lead to revisions of current DP practices.

In summary, the paper shows that differential privacy can have the unintended consequence of widening accuracy disparities between subgroups, and it calls for a reevaluation of how these techniques are applied in practice. DP remains valuable for protecting individual data points, but adapting DP training to account for fairness will be necessary for models that serve all subgroups well.
