Nonparametric extensions of randomized response for private confidence sets (2202.08728v4)
Abstract: This work derives methods for performing nonparametric, nonasymptotic statistical inference for population means under the constraint of local differential privacy (LDP). Given bounded observations $(X_1, \dots, X_n)$ with mean $\mu^\star$ that are privatized into $(Z_1, \dots, Z_n)$, we present confidence intervals (CIs) and time-uniform confidence sequences (CSs) for $\mu^\star$ when only given access to the privatized data. To achieve this, we study a nonparametric and sequentially interactive generalization of Warner's famous "randomized response" mechanism, satisfying LDP for arbitrary bounded random variables, and then provide CIs and CSs for their means given access to the resulting privatized observations. For example, our results yield private analogues of Hoeffding's inequality in both fixed-time and time-uniform regimes. We extend these Hoeffding-type CSs to capture time-varying (non-stationary) means, and conclude by illustrating how these methods can be used to conduct private online A/B tests.
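To make the starting point concrete, here is a minimal sketch of the classical binary case only: Warner-style randomized response followed by a debiased Hoeffding interval. It is not the paper's nonparametric, sequentially interactive mechanism. The function names, the choice $\varepsilon = 2$, and the simulated data are all illustrative assumptions. Each bit is reported truthfully with probability $r$ and replaced by a fair coin flip otherwise, so $\mathbb{E}[Z] = r\mu^\star + (1 - r)/2$ and the mechanism satisfies $\varepsilon$-LDP with $\varepsilon = \log\frac{1+r}{1-r}$.

```python
import math
import random

def truth_probability(epsilon):
    """Truth-telling probability r that makes binary randomized response epsilon-LDP."""
    return (math.exp(epsilon) - 1.0) / (math.exp(epsilon) + 1.0)

def randomized_response(x, r, rng):
    """Report the true bit x with probability r, otherwise report a fair coin flip."""
    if rng.random() < r:
        return x
    return 1 if rng.random() < 0.5 else 0

def private_hoeffding_ci(z, r, alpha=0.05):
    """Two-sided (1 - alpha) Hoeffding interval for the mean of the raw bits.

    Uses only the privatized bits z. Since E[Z] = r * mu + (1 - r) / 2,
    the sample mean of z is debiased and the usual Hoeffding half-width
    for [0, 1]-valued data is inflated by 1 / r.
    """
    n = len(z)
    z_bar = sum(z) / n
    mu_hat = (z_bar - (1.0 - r) / 2.0) / r
    half_width = math.sqrt(math.log(2.0 / alpha) / (2.0 * n)) / r
    # Clip to [0, 1], the known range of the mean.
    return max(0.0, mu_hat - half_width), min(1.0, mu_hat + half_width)

# Hypothetical simulation: 20000 users with true mean 0.3 and epsilon = 2.
rng = random.Random(0)
epsilon = 2.0
r = truth_probability(epsilon)
x = [1 if rng.random() < 0.3 else 0 for _ in range(20000)]
z = [randomized_response(xi, r, rng) for xi in x]
lower, upper = private_hoeffding_ci(z, r)
print(f"r = {r:.3f}, 95% CI for the mean: [{lower:.3f}, {upper:.3f}]")
```

The $1/r$ inflation of the half-width is the price of privacy in this simple setting: as $\varepsilon \to \infty$, $r \to 1$ and the interval reduces to the usual non-private Hoeffding interval.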
- Domain compression and its application to randomness-optimal distributed goodness-of-fit. In Proceedings of Thirty Third Conference on Learning Theory, volume 125, pages 3–40. PMLR, 2020a.
- Inference under information constraints I: Lower bounds from Chi-square contraction. IEEE Transactions on Information Theory, 66(12):7835–7855, 2020b.
- Inference under information constraints III: Local privacy constraints. IEEE Journal on Selected Areas in Information Theory, 2(1):253–267, 2021a.
- Differentially private Assouad, Fano, and Le Cam. In Algorithmic Learning Theory, pages 48–78. PMLR, 2021b.
- Interactive inference under information constraints. IEEE Transactions on Information Theory, 68(1):502–516, 2022.
- Pan-private uniformity testing. In Proceedings of Thirty Third Conference on Learning Theory, volume 125, pages 183–218. PMLR, 2020.
- Apple Inc. Differential privacy overview. https://www.apple.com/privacy/docs/Differential_Privacy_Overview.pdf, 2022. Accessed: 2022-02-01.
- Tuning bandit algorithms in stochastic environments. In International conference on algorithmic learning theory, pages 150–165. Springer, 2007.
- Differentially private uniformly most powerful tests for binomial data. Advances in Neural Information Processing Systems, 31:4208–4218, 2018.
- Fisher information under local differential privacy. IEEE Journal on Selected Areas in Information Theory, 1(3):645–659, 2020.
- An electronic digital computor using cold cathode counting tubes for storage. Electronic Engineering, 23:286–91, 1951.
- Vidmantas Bentkus. On Hoeffding’s inequalities. The Annals of Probability, 32(2):1650–1673, 2004.
- Locally private non-asymptotic testing of discrete distributions is faster using interactive mechanisms. In Advances in Neural Information Processing Systems, volume 33, pages 3164–3173, 2020.
- Tom Berrett and Yi Yu. Locally private online change point detection. Advances in Neural Information Processing Systems, 34, 2021.
- Locally differentially private estimation of functionals of discrete distributions. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, pages 24753–24764. Curran Associates, Inc., 2021.
- Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids. Bernoulli, 26(3):1727–1764, 2020.
- The structure of optimal private tests for simple hypotheses. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 310–321, 2019.
- Differentially private nonparametric hypothesis testing. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pages 737–751, 2019.
- Unbiased statistical estimation and valid confidence intervals under differential privacy. arXiv preprint arXiv:2110.14465, 2021.
- Confidence sequences for mean, variance, and median. Proceedings of the National Academy of Sciences of the United States of America, 58(1):66, 1967.
- Collecting telemetry data privately. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 3574–3583, 2017.
- Non-parametric differentially private confidence intervals for the median. arXiv preprint arXiv:2106.10333, 2021.
- The right complexity measure in locally private estimation: It is not the Fisher information. arXiv preprint arXiv:1806.05756, 2018.
- Local privacy and statistical minimax rates. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 429–438. IEEE, 2013a.
- Local privacy and minimax bounds: sharp rates for probability estimation. In Proceedings of the 26th International Conference on Neural Information Processing Systems-Volume 1, pages 1529–1537, 2013b.
- Minimax optimal procedures for locally private estimation. Journal of the American Statistical Association, 113(521):182–201, 2018.
- The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014.
- Calibrating noise to sensitivity in private data analysis. In Theory of cryptography conference, pages 265–284. Springer, 2006.
- Exponential inequalities for martingales with applications. Electronic Journal of Probability, 20:1–22, 2015.
- Parametric bootstrap for differentially private confidence intervals. In International Conference on Artificial Intelligence and Statistics, pages 1598–1618. PMLR, 2022.
- George E Forsythe. Reprint of a note on rounding-off errors. SIAM review, 1(1):66, 1959.
- Differentially private chi-squared hypothesis testing: Goodness of fit and independence testing. In International conference on machine learning, pages 2111–2120. PMLR, 2016.
- Locally private mean estimation: $z$-test and tight confidence intervals. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2545–2554. PMLR, 2019.
- Safe testing. arXiv preprint arXiv:1906.07801, 2019.
- Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, 1963.
- A generalization of sampling without replacement from a finite universe. Journal of the American statistical Association, 47(260):663–685, 1952.
- Time-uniform Chernoff bounds via nonnegative supermartingales. Probability Surveys, 17:257–317, 2020.
- Time-uniform, nonparametric, nonasymptotic confidence sequences. The Annals of Statistics, 49(2):1055–1080, 2021.
- Tests of probabilistic models for propagation of roundoff errors. Communications of the ACM, 9(2):108–113, 1966.
- Peeking at A/B tests: Why it matters, and what to do about it. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1517–1525, 2017.
- Locally private Gaussian estimation. Advances in Neural Information Processing Systems, 32:2984–2993, 2019.
- Parameter-free online convex optimization with sub-exponential noise. In Conference on Learning Theory, pages 1802–1823. PMLR, 2019.
- Extremal mechanisms for local differential privacy. Advances in neural information processing systems, 27, 2014.
- Discrete distribution estimation under local privacy. In International Conference on Machine Learning, pages 2436–2444. PMLR, 2016.
- Private mean estimation of heavy-tailed distributions. In Conference on Learning Theory, pages 2204–2235. PMLR, 2020.
- Finite sample differentially private confidence intervals. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.
- What can we learn privately? SIAM Journal on Computing, 40(3):793–826, 2011.
- Near-optimal confidence sequences for bounded random variables. In International Conference on Machine Learning, pages 5827–5837. PMLR, 2021.
- Estimating numerical distributions under local differential privacy. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pages 621–635, 2020.
- Empirical Bernstein bounds and sample variance penalization. In Conference on Learning Theory, pages 2372–2387. PMLR, 2009.
- Ilya Mironov. On significance of the least significant bits for differential privacy. In Proceedings of the 2012 ACM conference on Computer and communications security, pages 650–661, 2012.
- Tight concentrations and confidence sequences from the regret of universal portfolio. arXiv preprint arXiv:2110.14099, 2021.
- Testing exchangeability: Fork-convexity, supermartingales and e-processes. International Journal of Approximate Reasoning, 2021.
- Herbert Robbins. Statistical methods related to the law of the iterated logarithm. The Annals of Mathematical Statistics, 41(5):1397–1409, 1970.
- Estimation of regression coefficients when some regressors are not always observed. Journal of the American statistical Association, 89(427):846–866, 1994.
- Glenn Shafer. Testing by betting: A strategy for statistical and scientific communication. Journal of the Royal Statistical Society: Series A (Statistics in Society), 184(2):407–431, 2021.
- Test martingales, Bayes factors and p-values. Statistical Science, 26(1):84–101, 2011.
- Judith ter Schure and Peter Grünwald. ALL-IN meta-analysis: breathing life into living systematic reviews. arXiv preprint arXiv:2109.12141, 2021.
- Jean Ville. Etude critique de la notion de collectif. Bull. Amer. Math. Soc, 45(11):824, 1939.
- E-values: Calibration, combination and applications. The Annals of Statistics, 49(3):1736–1754, 2021.
- Differential privacy for clinical trial data: Preliminary evaluations. In 2009 IEEE International Conference on Data Mining Workshops, pages 138–143. IEEE, 2009.
- Abraham Wald. Sequential tests of statistical hypotheses. The annals of mathematical statistics, 16(2):117–186, 1945.
- Collecting and analyzing multidimensional data with local differential privacy. In 2019 IEEE 35th International Conference on Data Engineering (ICDE), pages 638–649. IEEE, 2019.
- False discovery rate control with e-values. Journal of the Royal Statistical Society, Series B, 2022.
- Differentially private algorithms for statistical verification of cyber-physical systems. IEEE Open Journal of Control Systems, 1:294–305, 2022.
- Differentially private hypothesis testing, revisited. arXiv preprint arXiv:1511.03376, 2015.
- Stanley L Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63–69, 1965.
- A statistical framework for differential privacy. Journal of the American Statistical Association, 105(489):375–389, 2010.
- Confidence sequences for sampling without replacement. Advances in Neural Information Processing Systems, 33, 2020.
- Estimating means of bounded random variables by betting. Journal of the Royal Statistical Society, Series B (to appear with discussion), 2023.
- Anytime-valid off-policy inference for contextual bandits. arXiv preprint arXiv:2210.10768, 2022.
- Adaptive concentration inequalities for sequential decision problems. Advances in Neural Information Processing Systems, 29, 2016.