TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm
Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar

TL;DR
This paper introduces TURF, a near-linear-time, essentially sample-optimal distribution learning algorithm that attains the optimal approximation factor of 2 relative to the closest piecewise polynomial, improving on the factors established for existing efficient methods.
Contribution
The paper presents a distribution learning algorithm that attains the optimal approximation factor of 2 in every case except the simplest one, and provides a method for estimating a suitable number of polynomial pieces for practical distributions.
Findings
Achieves the optimal approximation factor of 2 for distribution learning.
Provides a near-linear-time, sample-efficient estimator.
Demonstrates improved empirical performance over existing algorithms.
Abstract
Approximating distributions from their samples is a canonical statistical-learning problem. One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial, where $t \ge 1$ and $d \ge 0$. Letting $c_{t,d}$ denote the smallest such factor, clearly $c_{1,0} = 1$, and it can be shown that $c_{t,d} \ge 2$ for all other $t$ and $d$. Yet current computationally efficient algorithms show only $c_{t,1} \le 2.25$ and the bound rises quickly to $c_{t,d} \le 3$ for $d \ge 9$. We derive a near-linear-time and essentially sample-optimal estimator that establishes $c_{t,d} = 2$ for all $(t,d) \ne (1,0)$. Additionally, for many practical distributions, the lowest approximation distance is achieved by polynomials with vastly varying number of pieces. We provide a method that…
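The "factor" in the abstract can be read as the constant in a guarantee of the following shape. This is a sketch under assumed notation, not the paper's exact statement: $f$ is the unknown distribution, $f^{\mathrm{est}}$ is the estimate built from $n$ samples, $\mathcal{P}_{t,d}$ is the class of $t$-piece degree-$d$ polynomials, and $\epsilon_n$ is a statistical error term that shrinks as $n$ grows. None of these symbols appear in the abstract as shown here.

```latex
% Assumed notation (not from the source): f, f^est, P_{t,d}, eps_n.
% An estimator has approximation factor c for the class P_{t,d} if,
% for every distribution f,
\[
  \bigl\| f^{\mathrm{est}} - f \bigr\|_1
  \;\le\;
  c \cdot \min_{p \in \mathcal{P}_{t,d}} \bigl\| p - f \bigr\|_1
  \;+\; \epsilon_n .
\]
% The abstract's factor is the smallest c for which some estimator
% satisfies this bound (up to the vanishing term eps_n).
```

Under this reading, a factor of 2 means the estimate is at most about twice as far from $f$ in $\ell_1$ distance as the best piecewise polynomial in the class.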
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Algorithms and Data Compression
