Privacy of the last iterate in cyclically-sampled DP-SGD on nonconvex composite losses (2407.05237v3)

Published 7 Jul 2024 in cs.LG, cs.DS, math.OC, stat.ML, and cs.CR

Abstract: Differentially-private stochastic gradient descent (DP-SGD) is a family of iterative machine learning training algorithms that privatize gradients to generate a sequence of differentially-private (DP) model parameters. It is also the standard tool used to train DP models in practice, even though most users are only interested in protecting the privacy of the final model. Tight DP accounting for the last iterate would minimize the amount of noise required while maintaining the same privacy guarantee and potentially increasing model utility. However, last-iterate accounting is challenging, and existing works require strong assumptions not satisfied by most implementations. These include assuming (i) the global sensitivity constant is known - to avoid gradient clipping; (ii) the loss function is Lipschitz or convex; and (iii) input batches are sampled randomly. In this work, we forgo any unrealistic assumptions and provide privacy bounds for the most commonly used variant of DP-SGD, in which data is traversed cyclically, gradients are clipped, and only the last model is released. More specifically, we establish new Rényi differential privacy (RDP) upper bounds for the last iterate under realistic assumptions of small stepsize and Lipschitz smoothness of the loss function. Our general bounds also recover the special-case convex bounds when the weak-convexity parameter of the objective function approaches zero and no clipping is performed. The approach itself leverages optimal transport techniques for last-iterate bounds, which is a nontrivial task when the data is traversed cyclically and the loss function is nonconvex.
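
The abstract describes the DP-SGD variant the paper analyzes: data traversed cyclically (in a fixed order rather than sampled randomly), per-example gradients clipped, Gaussian noise added, and only the last iterate released. The sketch below is a minimal illustration of that pattern, not the paper's implementation or its accounting; the logistic-loss gradient, step size, clipping norm, and noise multiplier are illustrative assumptions.

```python
import numpy as np

def clip(grad, clip_norm):
    # Per-example clipping: rescale so the L2 norm is at most clip_norm.
    norm = np.linalg.norm(grad)
    return grad * min(1.0, clip_norm / (norm + 1e-12))

def logistic_grad(theta, x, y):
    # Gradient of the logistic loss for a single example (y in {0, 1}).
    p = 1.0 / (1.0 + np.exp(-x @ theta))
    return (p - y) * x

def dp_sgd_cyclic(X, y, grad_fn, epochs=3, batch_size=32,
                  stepsize=0.05, clip_norm=1.0, noise_multiplier=1.0,
                  seed=0):
    """DP-SGD with cyclic batches, per-example clipping, and Gaussian
    noise; only the final model parameters are returned (released)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        # Cyclic sampling: traverse the data in the same fixed order each epoch.
        for start in range(0, n, batch_size):
            idx = range(start, min(start + batch_size, n))
            # Clip each per-example gradient, then sum over the batch.
            g_sum = np.zeros(d)
            for i in idx:
                g_sum += clip(grad_fn(theta, X[i], y[i]), clip_norm)
            # Gaussian noise scaled to the clipping norm (the sensitivity).
            noise = rng.normal(0.0, noise_multiplier * clip_norm, size=d)
            theta = theta - stepsize * (g_sum + noise) / len(idx)
    # Intermediate iterates are never released; only the last model is.
    return theta

# Toy usage with synthetic data (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(float)
theta_private = dp_sgd_cyclic(X, y, logistic_grad)
```

In this setting the paper's contribution is the privacy accounting for the returned last iterate under cyclic traversal, clipping, and nonconvex losses, rather than the training loop itself.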
