
Better and Simpler Lower Bounds for Differentially Private Statistical Estimation (2310.06289v2)

Published 10 Oct 2023 in math.ST, cs.CR, cs.DS, cs.IT, cs.LG, math.IT, and stat.TH

Abstract: We provide optimal lower bounds for two well-known parameter estimation (also known as statistical estimation) tasks in high dimensions with approximate differential privacy. First, we prove that for any $\alpha \le O(1)$, estimating the covariance of a Gaussian up to spectral error $\alpha$ requires $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}\right)$ samples, which is tight up to logarithmic factors. This result improves over previous work which established this for $\alpha \le O\left(\frac{1}{\sqrt{d}}\right)$, and is also simpler than previous work. Next, we prove that estimating the mean of a heavy-tailed distribution with bounded $k$th moments requires $\tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \frac{d}{\alpha^2}\right)$ samples. Previous work for this problem was only able to establish this lower bound against pure differential privacy, or in the special case of $k = 2$. Our techniques follow the method of fingerprinting and are generally quite simple. Our lower bound for heavy-tailed estimation is based on a black-box reduction from privately estimating identity-covariance Gaussians. Our lower bound for covariance estimation utilizes a Bayesian approach to show that, under an Inverse Wishart prior distribution for the covariance matrix, no private estimator can be accurate even in expectation, without sufficiently many samples.
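To get a feel for how the two bounds in the abstract scale, the following is a minimal sketch that evaluates the leading terms $\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}$ and $\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \frac{d}{\alpha^2}$ numerically. It is not from the paper: the function names are hypothetical, and the hidden constants and logarithmic factors inside $\tilde{\Omega}(\cdot)$ are simply set to 1, so the outputs indicate growth rates only, not actual sample counts.

```python
import math


def covariance_lower_bound(d: int, alpha: float, eps: float) -> float:
    """Leading terms of the covariance bound: d^{3/2}/(alpha*eps) + d/alpha^2.

    Assumption: constants and log factors hidden by the tilde-Omega are dropped.
    """
    private_term = d ** 1.5 / (alpha * eps)   # cost attributable to privacy
    statistical_term = d / alpha ** 2         # non-private statistical cost
    return private_term + statistical_term


def heavy_tailed_mean_lower_bound(d: int, alpha: float, eps: float, k: float) -> float:
    """Leading terms of the heavy-tailed mean bound:
    d/(alpha^{k/(k-1)} * eps) + d/alpha^2, for bounded k-th moments (k > 1).

    Assumption: constants and log factors hidden by the tilde-Omega are dropped.
    """
    if k <= 1:
        raise ValueError("k must exceed 1 for the exponent k/(k-1) to be defined")
    private_term = d / (alpha ** (k / (k - 1)) * eps)
    statistical_term = d / alpha ** 2
    return private_term + statistical_term


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    for d in (10, 100, 1000):
        cov = covariance_lower_bound(d, alpha=0.1, eps=1.0)
        mean = heavy_tailed_mean_lower_bound(d, alpha=0.1, eps=1.0, k=4)
        print(f"d={d:5d}  covariance ~ {cov:.3g}  heavy-tailed mean ~ {mean:.3g}")
```

For $k = 2$ the exponent $k/(k-1)$ equals 2, so the privacy term of the mean bound becomes $\frac{d}{\alpha^2 \varepsilon}$; as $k$ grows the exponent approaches 1.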
