Sampling Correctors
Clément Canonne, Themis Gouleakis, Ronitt Rubinfeld

TL;DR
This paper introduces sampling correctors: algorithms that correct noisy samples on the fly by exploiting the structure the underlying distribution is purported to have. It connects them to distribution learning and property testing algorithms, and gives efficient correctors for monotone distributions.
Contribution
It presents the concept of sampling correctors, explores their connection to distribution learning and testing, and develops efficient correction algorithms for monotone distributions under various access models.
Findings
Sampling correctors can be designed using proper learning algorithms.
Correctors can be more sample-efficient than learning algorithms for certain distribution families.
Efficient correction algorithms are developed for monotone distributions with different access models.
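To illustrate the first finding, the sketch below follows a learn-then-sample pattern: draw samples from the noisy source, fit a hypothesis that is itself monotone (here, non-increasing), and then answer every request for a sample from that hypothesis. It is not the paper's algorithm. The pool-adjacent-violators projection stands in for a proper learner, and the names `project_to_nonincreasing`, `learn_monotone_hypothesis`, and `make_corrector` are hypothetical, as are the domain {0, ..., n-1} and the sample budget.

```python
import random
from collections import Counter


def project_to_nonincreasing(values):
    """Pool-adjacent-violators: L2 projection onto non-increasing sequences.

    Merging blocks replaces values by their block average, so the total
    mass of the input is preserved.
    """
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # A violation is an earlier block whose mean is below the next one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out


def learn_monotone_hypothesis(samples, n):
    """Stand-in proper learner (an assumption): empirical pmf, then projection."""
    counts = Counter(samples)
    empirical = [counts.get(i, 0) / len(samples) for i in range(n)]
    return project_to_nonincreasing(empirical)


def make_corrector(noisy_sampler, n, num_samples, rng):
    """Learn once from the noisy source, then serve samples from the hypothesis."""
    samples = [noisy_sampler() for _ in range(num_samples)]
    hypothesis = learn_monotone_hypothesis(samples, n)
    domain = list(range(n))

    def corrected_sampler():
        # Every output comes from a distribution that is monotone by construction.
        return rng.choices(domain, weights=hypothesis, k=1)[0]

    return hypothesis, corrected_sampler
```

Calling `make_corrector(noisy_sampler, n, num_samples, random.Random(0))` returns the learned probabilities together with a zero-argument sampler. This sketch spends its whole sample budget up front on learning, which is the baseline the second finding compares against: for some distribution families a corrector can use fewer samples than a learner.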
Abstract
In many situations, sample data is obtained from a noisy or imperfect source. In order to address such corruptions, this paper introduces the concept of a sampling corrector. Such algorithms use structure that the distribution is purported to have, in order to allow one to make "on-the-fly" corrections to samples drawn from probability distributions. These algorithms then act as filters between the noisy data and the end user. We show connections between sampling correctors, distribution learning algorithms, and distribution property testing algorithms. We show that these connections can be utilized to expand the applicability of known distribution learning and property testing algorithms as well as to achieve improved algorithms for those tasks. As a first step, we show how to design sampling correctors using proper learning algorithms. We then focus on the question of whether…
Taxonomy
Topics: Machine Learning and Algorithms · Data Quality and Management · Advanced Database Systems and Queries
