Testing distributional assumptions of learning algorithms
Ronitt Rubinfeld, Arsen Vasilyan

TL;DR
This paper introduces a systematic model for designing tester-learner pairs that verify distributional assumptions before applying agnostic learning algorithms, demonstrated on Gaussian and uniform distributions with near-optimal runtimes.
Contribution
It proposes a new framework for constructing tester-learner pairs, applies it to agnostically learning halfspaces under Gaussian and uniform distributions, and shows that in some settings a tester-learner pair requires exponential time even though an efficient agnostic learner exists.
Findings
A tester-learner pair for Gaussian halfspaces with runtime $n^{\tilde{O}(1/\epsilon^4)}$
A tester-learner pair for uniform halfspaces with similar runtime
Existence of settings where tester-learner pairs require exponential time despite efficient agnostic learners
Abstract
There are many high-dimensional function classes that have fast agnostic learning algorithms when assumptions on the distribution of examples can be made, such as Gaussianity or uniformity over the domain. But how can one be confident that the data indeed satisfies such an assumption, so that one can trust the output quality of the agnostic learning algorithm? We propose a model by which to systematically study the design of tester-learner pairs, such that if the distribution on examples in the data passes the tester, then one can safely trust the output of the agnostic learner on the data. To demonstrate the power of the model, we apply it to the classical problem of agnostically learning halfspaces under the standard Gaussian distribution and present a tester-learner pair with combined run-time of $n^{\tilde{O}(1/\epsilon^4)}$. This…
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
