Replicability in High Dimensional Statistics

Max Hopkins; Russell Impagliazzo; Daniel Kane; Sihan Liu; Christopher Ye

arXiv:2406.02628·stat.ML·June 6, 2024

Open Access

TL;DR

This paper explores the computational and statistical costs of achieving replicability in high-dimensional statistical tasks, establishing fundamental equivalences and providing efficient algorithms with optimal sample complexity bounds.

Contribution

It introduces a novel equivalence between replicable algorithms and high-dimensional isoperimetric tilings, and develops efficient algorithms for mean estimation and coin problems.

Findings

1. Established sample complexity bounds for high-dimensional mean estimation.

2. Resolved open problems in replicability and coin problem sample complexity.

3. Developed algorithms that match or improve upon existing efficiency bounds.

Abstract

The replicability crisis is a major issue across nearly all areas of empirical science, calling for the formal study of replicability in statistics. Motivated in this context, [Impagliazzo, Lei, Pitassi, and Sorrell STOC 2022] introduced the notion of replicable learning algorithms, and gave basic procedures for $1$-dimensional tasks including statistical queries. In this work, we study the computational and statistical cost of replicability for several fundamental high dimensional statistical tasks, including multi-hypothesis testing and mean estimation. Our main contribution establishes a computational and statistical equivalence between optimal replicable algorithms and high dimensional isoperimetric tilings. As a consequence, we obtain matching sample complexity upper and lower bounds for replicable mean estimation of distributions with bounded covariance, resolving an open…
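For readers new to the notion, the $1$-dimensional idea mentioned in the abstract can be illustrated with a small sketch. An algorithm is replicable if two runs on independent samples from the same distribution, sharing the same internal randomness, return exactly the same output with high probability. The code below is not taken from the paper: it is a minimal illustration of one standard way to get this behaviour for a scalar mean, by rounding the empirical mean to a grid whose offset is drawn from the shared randomness. The names `replicable_mean`, `shared_seed`, and `cell_width` are hypothetical and chosen only for this example.

```python
import random


def replicable_mean(samples, shared_seed, cell_width):
    """Round the empirical mean to a randomly shifted 1-D grid.

    Illustrative sketch only (not the paper's algorithm). `shared_seed`
    stands in for the internal randomness shared across runs.
    """
    rng = random.Random(shared_seed)
    offset = rng.uniform(0.0, cell_width)  # random shift of the grid
    empirical = sum(samples) / len(samples)
    # Index of the cell [offset + k*w, offset + (k+1)*w) holding the empirical mean.
    k = (empirical - offset) // cell_width
    # Every empirical mean that lands in this cell maps to the same centre.
    return offset + (k + 0.5) * cell_width


# Two hypothetical sample sets whose empirical means differ slightly (1.00 vs 1.01).
run_a = [0.9, 1.1]
run_b = [1.0, 1.02]

# Fraction of shared seeds on which both runs return the identical value.
seeds = range(1000)
agreement = sum(
    replicable_mean(run_a, s, 1.0) == replicable_mean(run_b, s, 1.0) for s in seeds
) / len(seeds)
```

The two runs disagree only when a grid boundary happens to fall between the two empirical means, which occurs with probability roughly their distance divided by `cell_width`. The cost is accuracy: the output can be off by up to half a cell width. In this sketch the intervals play the role of a one-dimensional tiling, which is one way to read the abstract's connection between replicable algorithms and tilings in higher dimensions.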

Taxonomy

Topics: Statistical Methods and Inference