Replicability in High Dimensional Statistics
Max Hopkins, Russell Impagliazzo, Daniel Kane, Sihan Liu, Christopher Ye

TL;DR
This paper explores the computational and statistical costs of achieving replicability in high-dimensional statistical tasks, establishing fundamental equivalences and providing efficient algorithms with optimal sample complexity bounds.
Contribution
The paper establishes an equivalence between optimal replicable algorithms and high-dimensional isoperimetric tilings, and develops efficient algorithms for mean estimation and coin problems.
Findings
Established matching upper and lower sample complexity bounds for replicable mean estimation of distributions with bounded covariance.
Resolved open problems in replicability and coin problem sample complexity.
Developed algorithms that match or improve upon existing efficiency bounds.
Abstract
The replicability crisis is a major issue across nearly all areas of empirical science, calling for the formal study of replicability in statistics. Motivated in this context, [Impagliazzo, Lei, Pitassi, and Sorrell STOC 2022] introduced the notion of replicable learning algorithms, and gave basic procedures for 1-dimensional tasks including statistical queries. In this work, we study the computational and statistical cost of replicability for several fundamental high dimensional statistical tasks, including multi-hypothesis testing and mean estimation. Our main contribution establishes a computational and statistical equivalence between optimal replicable algorithms and high dimensional isoperimetric tilings. As a consequence, we obtain matching sample complexity upper and lower bounds for replicable mean estimation of distributions with bounded covariance, resolving an open…
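To make the notion of a replicable procedure concrete, here is a minimal one-dimensional sketch based on randomized rounding: the empirical mean is snapped to the centre of a grid cell, and the grid's offset is drawn from randomness shared across runs. This is an illustration of the general idea only, not the paper's high-dimensional algorithm; the function name `replicable_mean` and the parameters `cell_width` and `seed` are assumptions introduced for this example.

```python
import math
import random


def replicable_mean(samples, cell_width, seed):
    """Round the empirical mean to the centre of a randomly shifted grid cell.

    Hypothetical helper for illustration. The grid offset depends only on
    `seed` (the shared internal randomness), so two runs on different
    samples use exactly the same grid.
    """
    rng = random.Random(seed)
    # Shared random shift of the grid, uniform over one cell width.
    offset = rng.uniform(0.0, cell_width)
    empirical = sum(samples) / len(samples)
    # Index of the cell [offset + k*w, offset + (k+1)*w) containing the mean.
    cell_index = math.floor((empirical - offset) / cell_width)
    # Output the centre of that cell.
    return offset + (cell_index + 0.5) * cell_width
```

Two runs with the same `seed` return the same value whenever their empirical means fall in the same cell. If the two means differ by `d`, a cell boundary separates them with probability about `d / cell_width` over the choice of offset, so a wider cell buys more replicability at the cost of accuracy (the output can be up to `cell_width / 2` away from the empirical mean).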
Taxonomy
Topics: Statistical Methods and Inference
