Better and Simpler Lower Bounds for Differentially Private Statistical Estimation (2310.06289v2)
Abstract: We provide optimal lower bounds for two well-known parameter estimation (also known as statistical estimation) tasks in high dimensions with approximate differential privacy. First, we prove that for any $\alpha \le O(1)$, estimating the covariance of a Gaussian up to spectral error $\alpha$ requires $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}\right)$ samples, which is tight up to logarithmic factors. This result improves over previous work which established this for $\alpha \le O\left(\frac{1}{\sqrt{d}}\right)$, and is also simpler than previous work. Next, we prove that estimating the mean of a heavy-tailed distribution with bounded $k$th moments requires $\tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \frac{d}{\alpha^2}\right)$ samples. Previous work for this problem was only able to establish this lower bound against pure differential privacy, or in the special case of $k = 2$. Our techniques follow the method of fingerprinting and are generally quite simple. Our lower bound for heavy-tailed estimation is based on a black-box reduction from privately estimating identity-covariance Gaussians. Our lower bound for covariance estimation utilizes a Bayesian approach to show that, under an Inverse Wishart prior distribution for the covariance matrix, no private estimator can be accurate even in expectation, without sufficiently many samples.
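To get a feel for how the two bounds in the abstract scale, the following is a minimal sketch that evaluates the expressions inside the $\tilde{\Omega}(\cdot)$ with all constants and logarithmic factors dropped. The function names and the choice to ignore $\delta$ are assumptions made for illustration; the numbers are orders of magnitude only, not sample sizes guaranteed by the paper.

```python
def covariance_lower_bound(d: int, alpha: float, eps: float) -> float:
    """Evaluate d^{3/2}/(alpha*eps) + d/alpha^2.

    Illustrative only: constants and log factors hidden by the
    tilde-Omega notation are dropped.
    """
    if d <= 0 or alpha <= 0 or eps <= 0:
        raise ValueError("d, alpha and eps must be positive")
    private_term = d ** 1.5 / (alpha * eps)   # cost attributable to privacy
    statistical_term = d / alpha ** 2         # cost present even without privacy
    return private_term + statistical_term


def heavy_tailed_mean_lower_bound(d: int, alpha: float, eps: float, k: float) -> float:
    """Evaluate d/(alpha^{k/(k-1)} * eps) + d/alpha^2 for bounded k-th moments.

    Illustrative only: constants and log factors are dropped.
    """
    if d <= 0 or alpha <= 0 or eps <= 0:
        raise ValueError("d, alpha and eps must be positive")
    if k <= 1:
        raise ValueError("k must exceed 1 so that k/(k-1) is defined")
    private_term = d / (alpha ** (k / (k - 1)) * eps)
    statistical_term = d / alpha ** 2
    return private_term + statistical_term


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the scaling.
    for d in (10, 100, 1000):
        print(d,
              covariance_lower_bound(d, alpha=0.1, eps=1.0),
              heavy_tailed_mean_lower_bound(d, alpha=0.1, eps=1.0, k=4))
```

For $k = 2$ the exponent $k/(k-1)$ equals 2, so the privacy term becomes $d/(\alpha^2 \varepsilon)$; as $k$ grows the exponent approaches 1 and the privacy term approaches $d/(\alpha \varepsilon)$.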