Sample-Optimal Density Estimation in Nearly-Linear Time

Jayadev Acharya; Ilias Diakonikolas; Jerry Li; Ludwig Schmidt

arXiv:1506.00671 · cs.DS · June 3, 2015 · 2 cites

Open Access

TL;DR

This paper introduces a nearly-linear time, sample-optimal algorithm for agnostically learning univariate densities that are well approximated by piecewise polynomials, covering a wide range of structured distribution families in a unified way.

Contribution

The paper presents a unified, nearly-linear time algorithm for density estimation that is nearly optimal in sample complexity, applicable to various structured distribution families.

Findings

01

Achieves nearly optimal sample complexity for density estimation.

02

Runs in nearly-linear time, $\tilde{O}(n \cdot \mathrm{poly}(d))$ for $n$ samples and degree $d$.

03

Performs well in practical experiments.

Abstract

We design a new, fast algorithm for agnostically learning univariate probability distributions whose densities are well approximated by piecewise polynomial functions. Let $f$ be the density function of an arbitrary univariate distribution, and suppose that $f$ is $\mathrm{OPT}$-close in $L_1$-distance to an unknown piecewise polynomial function with $t$ interval pieces and degree $d$. Our algorithm draws $n = O(t(d+1)/\epsilon^2)$ samples from $f$, runs in time $\tilde{O}(n \cdot \mathrm{poly}(d))$, and with probability at least $9/10$ outputs an $O(t)$-piecewise degree-$d$ hypothesis $h$ that is $4 \cdot \mathrm{OPT} + \epsilon$ close to $f$. Our general algorithm yields (nearly) sample-optimal and nearly-linear time estimators for a wide range of structured distribution families over both continuous and discrete domains in a unified way. For most of our applications, these are the…
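To make the quantities in the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm. `sample_budget` evaluates the stated sample bound $t(d+1)/\epsilon^2$, with a hypothetical constant `c` standing in for the constant hidden by the $O(\cdot)$ notation. `histogram_density` builds the simplest member of the hypothesis class, a $t$-piece, degree-0 (piecewise constant) density on equal-width intervals. The function names, the equal-width partition, and the default `c=1.0` are all assumptions made for illustration.

```python
import math
import random


def sample_budget(t, d, eps, c=1.0):
    """Return ceil(c * t * (d + 1) / eps**2).

    `c` is a hypothetical stand-in for the constant hidden in the
    O(.) bound; the abstract does not state its value.
    """
    if t < 1 or d < 0 or eps <= 0:
        raise ValueError("need t >= 1, d >= 0, eps > 0")
    return math.ceil(c * t * (d + 1) / eps ** 2)


def histogram_density(samples, t, lo, hi):
    """Fit a t-piece, degree-0 density on [lo, hi) with equal-width pieces.

    This is a plain histogram, used only to illustrate what a
    piecewise-polynomial hypothesis looks like in the d = 0 case.
    Returns the list of t piece heights.
    """
    if not samples:
        raise ValueError("need at least one sample")
    width = (hi - lo) / t
    counts = [0] * t
    for x in samples:
        if lo <= x < hi:
            # Clamp to guard against floating-point edge cases near hi.
            idx = min(int((x - lo) / width), t - 1)
            counts[idx] += 1
    n = len(samples)
    return [k / (n * width) for k in counts]


if __name__ == "__main__":
    t, d, eps = 8, 0, 0.25
    n = sample_budget(t, d, eps)
    rng = random.Random(0)
    # Draw n samples from a triangular density on [0, 1) as a toy target f.
    draws = [rng.triangular(0.0, 1.0, 0.5) for _ in range(n)]
    heights = histogram_density(draws, t, 0.0, 1.0)
    print(n, [round(h, 2) for h in heights])
```

The budget grows linearly in the number of pieces $t$ and in $d + 1$, and quadratically in $1/\epsilon$, so halving $\epsilon$ quadruples the number of samples under this bound.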

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Machine Learning in Healthcare