A Statistical Perspective on Coreset Density Estimation
Paxton Turner, Jingbo Liu, Philippe Rigollet

TL;DR
This paper develops a statistical framework for coreset-based density estimation, establishing minimax rates and demonstrating near-optimal performance of practical coreset kernel density estimators for smooth densities.
Contribution
It introduces a theoretical analysis of coreset density estimators, including minimax rate characterization and optimality results for practical methods.
Findings
The minimax rate of estimation achievable by coreset-based estimators is established.
Practical coreset kernel density estimators are near-minimax optimal.
The near-optimality result holds over a large class of Hölder-smooth densities.
Abstract
Coresets have emerged as a powerful tool to summarize data by selecting a small subset of the original observations while retaining most of its information. This approach has led to significant computational speedups, but the performance of statistical procedures run on coresets is largely unexplored. In this work, we develop a statistical framework to study coresets and focus on the canonical task of nonparametric density estimation. Our contributions are twofold. First, we establish the minimax rate of estimation achievable by coreset-based estimators. Second, we show that the practical coreset kernel density estimators are near-minimax optimal over a large class of Hölder-smooth densities.
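To make the idea of a coreset kernel density estimator concrete, here is a minimal sketch in Python. It is not the construction analyzed in the paper: the coreset is a plain uniform subsample with equal weights, used only as a simple baseline, and the names `gaussian_kde`, `uniform_coreset`, the Gaussian kernel, the bandwidth, and the sample sizes are all illustrative assumptions.

```python
import numpy as np


def gaussian_kde(points, weights, bandwidth, grid):
    """Weighted Gaussian KDE in one dimension, evaluated at each grid value."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Standardized distance between every grid value and every data point.
    z = (grid[:, None] - points[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Weighted average of kernel bumps, rescaled by the bandwidth.
    return kernel @ weights / bandwidth


def uniform_coreset(data, m, rng):
    """Hypothetical baseline: m points drawn without replacement, equal weights."""
    idx = rng.choice(len(data), size=m, replace=False)
    return data[idx], np.full(m, 1.0 / m)


rng = np.random.default_rng(0)
data = rng.normal(size=2000)          # synthetic sample, assumed for illustration
grid = np.linspace(-5.0, 5.0, 401)
bandwidth = 0.3                       # assumed value, not tuned

# KDE built on all observations.
full_weights = np.full(len(data), 1.0 / len(data))
full_kde = gaussian_kde(data, full_weights, bandwidth, grid)

# KDE built only on the coreset (a subset of the original observations).
core_pts, core_w = uniform_coreset(data, 200, rng)
coreset_kde = gaussian_kde(core_pts, core_w, bandwidth, grid)

# Largest pointwise gap between the two estimates on the grid.
sup_gap = float(np.max(np.abs(full_kde - coreset_kde)))
print(f"sup-norm gap between full and coreset KDE: {sup_gap:.4f}")
```

The coreset estimator touches only 200 of the 2000 points at evaluation time, which is where the computational savings come from; the statistical question raised in the abstract is how small the subset can be before the estimation rate degrades.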
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Markov Chains and Monte Carlo Methods
Methods: Coresets
