
General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation (2301.13850v2)

Published 31 Jan 2023 in math.ST, cs.CR, cs.DS, cs.LG, stat.ML, and stat.TH

Abstract: We investigate unbiased high-dimensional mean estimators in differential privacy. We consider differentially private mechanisms whose expected output equals the mean of the input dataset, for every dataset drawn from a fixed bounded $d$-dimensional domain $K$. A classical approach to private mean estimation is to compute the true mean and add unbiased, but possibly correlated, Gaussian noise to it. In the first part of this paper, we study the optimal error achievable by a Gaussian noise mechanism for a given domain $K$ when the error is measured in the $\ell_p$ norm for some $p \ge 2$. We give algorithms that compute the optimal covariance for the Gaussian noise for a given $K$ under suitable assumptions, and prove a number of nice geometric properties of the optimal error. These results generalize the theory of factorization mechanisms from domains $K$ that are symmetric and finite (or, equivalently, symmetric polytopes) to arbitrary bounded domains. In the second part of the paper we show that Gaussian noise mechanisms achieve nearly optimal error among all private unbiased mean estimation mechanisms in a very strong sense. In particular, for every input dataset, an unbiased mean estimator satisfying concentrated differential privacy introduces approximately at least as much error as the best Gaussian noise mechanism. We extend this result to local differential privacy, and to approximate differential privacy, but for the latter the error lower bound holds either for a dataset or for a neighboring dataset, and this relaxation is necessary.
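The classical approach mentioned in the abstract, computing the true mean and adding zero-mean Gaussian noise, can be sketched as follows. This is a minimal illustration and not the paper's algorithm: it assumes that $K$ is the unit $\ell_2$ ball, that neighboring datasets differ by replacing one point, and that the noise is spherical. The function names `spherical_sigma` and `gaussian_noise_mean` are hypothetical. The noise scale uses the standard fact that Gaussian noise with standard deviation $\sigma$ on a query of $\ell_2$ sensitivity $\Delta$ satisfies $\rho$-concentrated differential privacy with $\rho = \Delta^2 / (2\sigma^2)$.

```python
import numpy as np


def spherical_sigma(n, rho, diameter=2.0):
    """Noise scale for rho-zCDP with spherical Gaussian noise.

    Assumption: replacing one of n points moves the mean by at most
    diameter / n in l2 norm (diameter = 2 for the unit l2 ball).
    """
    sensitivity = diameter / n
    return sensitivity / np.sqrt(2.0 * rho)


def gaussian_noise_mean(data, cov, rng):
    """Return the true mean of `data` plus zero-mean Gaussian noise.

    `cov` may be any positive semidefinite d x d matrix, so the noise can be
    correlated across coordinates. The noise has mean zero, so the expected
    output equals the true mean (the estimator is unbiased).
    """
    true_mean = data.mean(axis=0)
    noise = rng.multivariate_normal(np.zeros(data.shape[1]), cov)
    return true_mean + noise


rng = np.random.default_rng(0)
n, d, rho = 50, 3, 0.5

# Illustrative dataset, scaled so every point lies in the unit l2 ball.
data = rng.normal(size=(n, d))
data /= np.maximum(1.0, np.linalg.norm(data, axis=1, keepdims=True))

sigma = spherical_sigma(n, rho)
cov = sigma**2 * np.eye(d)  # spherical choice, for illustration only
estimate = gaussian_noise_mean(data, cov, rng)
```

The spherical covariance here is only a baseline. The paper's first part is about choosing a covariance adapted to the geometry of $K$, which this sketch does not attempt.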

Authors (2)
  1. Aleksandar Nikolov (36 papers)
  2. Haohua Tang (2 papers)
