Emergent Mind

Practical Differentially Private Hyperparameter Tuning with Subsampling

(2301.11989)
Published Jan 27, 2023 in cs.LG and cs.CR

Abstract

Tuning the hyperparameters of differentially private (DP) ML algorithms often requires use of sensitive data and this may leak private information via hyperparameter values. Recently, Papernot and Steinke (2022) proposed a certain class of DP hyperparameter tuning algorithms, where the number of random search samples is randomized itself. Commonly, these algorithms still considerably increase the DP privacy parameter $\varepsilon$ over non-tuned DP ML model training and can be computationally heavy as evaluating each hyperparameter candidate requires a new training run. We focus on lowering both the DP bounds and the computational cost of these methods by using only a random subset of the sensitive data for the hyperparameter tuning and by extrapolating the optimal values to a larger dataset. We provide a Rényi differential privacy analysis for the proposed method and experimentally show that it consistently leads to better privacy-utility trade-off than the baseline method by Papernot and Steinke.

Overview

  • The paper presents methods for hyperparameter tuning in differential privacy (DP) machine learning models that lower computational and privacy costs.

  • Hyperparameter tuning challenges in DP ML, including for Differentially Private Stochastic Gradient Descent (DP-SGD), are addressed by using a subsampling strategy and tailored privacy loss calculations.

  • Experimental results demonstrate that the proposed methods improve privacy-utility trade-offs and reduce gradient evaluations, resulting in computational savings.

  • A framework is developed to adjust hyperparameters that impact the DP guarantees themselves, with strategies for bounding the search space and establishing uniform RDP bounds.

Introduction

Hyperparameter tuning is a critical stage in the development of ML models, ensuring optimal performance. However, for differentially private (DP) ML models, the process can prove challenging. The paper advances this line of work by presenting methods that reduce both the computational cost and the privacy cost of hyperparameter tuning in DP ML models. The work is rooted in the premise that privacy preservation must cover not just the model parameters but also the hyperparameters, and it offers an analysis within the Rényi Differential Privacy (RDP) framework.

Hyperparameter Tuning in DP ML Models

Hyperparameter tuning for DP ML models, particularly for Differentially Private Stochastic Gradient Descent (DP-SGD), typically incurs two main costs: a computational cost, due to the numerous training runs needed to evaluate hyperparameter candidates, and a privacy cost, due to the resulting increase in the privacy budget ε. The paper tackles this challenge by tuning on only a random subset of the dataset and extrapolating the optimal hyperparameter values to the full, larger dataset. The approach leverages existing RDP analyses of black-box tuning algorithms and tailors the privacy loss calculations to the subsampling strategy used.

Numerical Results and Insights

The authors validate the proposed method experimentally on standard datasets, demonstrating improved privacy-utility trade-offs compared to the established baseline. The tailored RDP analysis confirms that tuning on a smaller subset reduces the DP privacy leakage, which yields better utility at a given privacy budget without exhaustive computational expense. Importantly, the paper quantifies these improvements: the subsampling approach requires up to 6-8 times fewer gradient evaluations than the baseline, translating into significant computational savings.

Hyperparameters Impacting DP Guarantees

In DP-SGD, hyperparameters such as the noise level and the subsampling ratio are intertwined with the DP guarantees themselves. To handle this, the paper introduces a framework that adjusts these hyperparameters so that every candidate meets a pre-defined DP target. Using an evaluation grid or random sampling of candidates, the authors show how to bound the search space and establish uniform RDP bounds across the candidate models, underpinning a harmonized yet efficient tuning process.
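To make the uniform-bound idea concrete, here is a hedged sketch of calibrating the noise multiplier so that every candidate run meets the same preset (ε, δ) target. It uses the standard RDP bound for the plain (non-subsampled) Gaussian mechanism, RDP(α) = α/(2σ²), with the usual RDP→(ε, δ) conversion; the paper's actual analysis concerns the subsampled Gaussian mechanism of DP-SGD, so this is a deliberate simplification for illustration.

```python
import math


def gaussian_rdp_to_dp(sigma: float, delta: float, alphas) -> float:
    """Best (eps, delta)-DP guarantee implied by Gaussian-mechanism RDP.

    Uses RDP(alpha) = alpha / (2 sigma^2) and the conversion
    eps = RDP(alpha) + log(1/delta) / (alpha - 1), minimized over a
    grid of RDP orders alpha > 1.
    """
    return min(a / (2 * sigma**2) + math.log(1 / delta) / (a - 1) for a in alphas)


def calibrate_sigma(target_eps: float, delta: float = 1e-5) -> float:
    """Binary-search the smallest noise multiplier meeting the target.

    eps is monotonically decreasing in sigma, so bisection applies.
    """
    alphas = [1 + x / 10 for x in range(1, 1000)]  # orders 1.1 .. 100.9
    lo, hi = 0.1, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if gaussian_rdp_to_dp(mid, delta, alphas) <= target_eps:
            hi = mid
        else:
            lo = mid
    return hi


# Give every candidate run the sigma that meets eps = 1.0, so all
# candidates share the same uniform privacy bound regardless of the
# other hyperparameters being searched.
sigma = calibrate_sigma(1.0)
```

In the paper's setting the same principle applies per candidate: hyperparameters that affect the guarantee (noise level, subsampling ratio) are co-adjusted so the RDP bound is identical across the search, which is what makes the tuning algorithm's overall privacy accounting tractable.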

Conclusion and Future Work

The paper concludes by highlighting its contributions to reducing the computational and privacy costs of DP hyperparameter tuning. The algorithmic innovations are complemented by compelling experimental results, presenting a robust argument for the viability of the proposed methods. The overarching theme is a methodical synthesis of classical hyperparameter tuning with privacy-specific considerations. The authors encourage future work to extend these principles to other optimizers, to strengthen the rigorous privacy analyses, and to develop more general hyperparameter extrapolation techniques.
