
Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages (2404.10201v2)

Published 16 Apr 2024 in cs.DS, cs.CR, cs.IT, cs.LG, and math.IT

Abstract: We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in \mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2, d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each user to send $\Omega(\min(n\varepsilon^2, d)/\log(n))$ messages, demonstrating the optimality of our message complexity up to logarithmic factors. Additionally, we study the single-message setting and design a protocol that achieves mean squared error $\mathcal{O}(d n^{d/(d+2)} \varepsilon^{-4/(d+2)})$. Moreover, we show that any single-message protocol must incur mean squared error $\Omega(d n^{d/(d+2)})$, showing that our protocol is optimal in the standard setting where $\varepsilon = \Theta(1)$. Finally, we study robustness to malicious users and show that malicious users can incur large additive error with a single shuffler.
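
To get a feel for how the stated rates scale, the expressions in the abstract can be evaluated numerically. The sketch below is only an illustration: it drops all constants and the logarithmic factors hidden by $\tilde{\mathcal{O}}$, and the function names and sample values of $n$, $d$, and $\varepsilon$ are assumptions chosen for this example, not anything taken from the paper.

```python
import math


def multi_message_count(n: int, eps: float, d: int) -> float:
    """min(n * eps^2, d): per-user message count, ignoring constants and log factors."""
    return min(n * eps ** 2, d)


def message_lower_bound(n: int, eps: float, d: int) -> float:
    """min(n * eps^2, d) / log(n): the stated lower bound, ignoring constants."""
    return min(n * eps ** 2, d) / math.log(n)


def single_message_mse_upper(n: int, eps: float, d: int) -> float:
    """d * n^(d/(d+2)) * eps^(-4/(d+2)): single-message upper bound, ignoring constants."""
    return d * n ** (d / (d + 2)) * eps ** (-4 / (d + 2))


def single_message_mse_lower(n: int, d: int) -> float:
    """d * n^(d/(d+2)): single-message lower bound, ignoring constants."""
    return d * n ** (d / (d + 2))


if __name__ == "__main__":
    n, d = 10_000, 100  # hypothetical sample values
    for eps in (0.05, 1.0):
        print(
            f"eps={eps}: messages ~ {multi_message_count(n, eps, d):.1f}, "
            f"lower bound ~ {message_lower_bound(n, eps, d):.1f}, "
            f"single-message MSE upper ~ {single_message_mse_upper(n, eps, d):.3g}, "
            f"lower ~ {single_message_mse_lower(n, d):.3g}"
        )
```

With $\varepsilon = 1$ the factor $\varepsilon^{-4/(d+2)}$ equals 1, so the two single-message expressions coincide, which matches the abstract's claim of optimality when $\varepsilon = \Theta(1)$.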

Authors (6)
  1. Hilal Asi (29 papers)
  2. Vitaly Feldman (71 papers)
  3. Jelani Nelson (53 papers)
  4. Huy L. Nguyen (49 papers)
  5. Samson Zhou (77 papers)
  6. Kunal Talwar (83 papers)
