Data Amplification: A Unified and Competitive Approach to Property Estimation
Yi Hao, Alon Orlitsky, Ananda T. Suresh, Yihong Wu

TL;DR
This paper introduces a unified, linear-time property estimator that matches the empirical estimator's performance using substantially fewer samples, in effect amplifying the available data for a wide class of distribution properties.
Contribution
It presents the first unified, linear-time property estimator that matches the performance of the empirical estimator with substantially fewer samples, providing distribution-independent data amplification.
Findings
Matches the performance of the empirical estimator while using fewer samples
Outperforms existing estimators for many properties and distributions
Achieves near-optimal sample complexity for a wide class of properties
Abstract
Estimating properties of discrete distributions is a fundamental problem in statistical learning. We design the first unified, linear-time, competitive, property estimator that for a wide class of properties and for all underlying distributions uses just 2n samples to achieve the performance attained by the empirical estimator with n√(log n) samples. This provides off-the-shelf, distribution-independent, "amplification" of the amount of data available relative to common-practice estimators. We illustrate the estimator's practical advantages by comparing it to existing estimators for a wide variety of properties and distributions. In most cases, its performance with n samples is even as good as that of the empirical estimator with n log n samples, and for essentially all properties, its performance is comparable to that of the best existing estimator designed specifically…
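For context, the empirical (plug-in) estimator that the abstract uses as its baseline simply evaluates the property on the observed symbol frequencies. Below is a minimal Python sketch, assuming Shannon entropy and support size as example properties. The function names are illustrative, and this is the common-practice baseline only, not the paper's proposed estimator.

```python
import math
from collections import Counter


def empirical_distribution(samples):
    """Return observed relative frequencies as a dict {symbol: frequency}."""
    n = len(samples)
    counts = Counter(samples)
    return {symbol: count / n for symbol, count in counts.items()}


def empirical_entropy(samples):
    """Plug-in estimate of Shannon entropy (in nats).

    Assumption for illustration: the property is entropy; the abstract
    covers a wider class of properties.
    """
    p_hat = empirical_distribution(samples)
    return -sum(p * math.log(p) for p in p_hat.values())


def empirical_support_size(samples):
    """Plug-in estimate of support size: the number of distinct symbols seen."""
    return len(set(samples))
```

With few samples, plug-in estimates like these can be far from the true property value (unseen symbols contribute nothing), which is the gap that a more sample-efficient estimator aims to close.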
Taxonomy
Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Machine Learning and Data Classification
