Testing distributional assumptions of learning algorithms

Ronitt Rubinfeld; Arsen Vasilyan

arXiv:2204.07196·cs.LG·November 22, 2022

TL;DR

This paper introduces a model for designing tester-learner pairs that verify distributional assumptions before an agnostic learning algorithm's output is trusted. It is demonstrated on halfspaces under Gaussian and uniform distributions, with a combined runtime of $n^{\tilde{O}(1/\epsilon^4)}$ in the Gaussian case.

Contribution

It proposes a new framework for constructing tester-learner pairs, applies it to Gaussian and uniform distributions, and explores the complexity gap in agnostic learning with distributional verification.

Findings

01

A tester-learner pair for Gaussian halfspaces with runtime $n^{\tilde{O}(1/\epsilon^4)}$

02

A tester-learner pair for uniform halfspaces with similar runtime

03

Existence of settings where tester-learner pairs require exponential time despite efficient agnostic learners

Abstract

Many high-dimensional function classes have fast agnostic learning algorithms when assumptions can be made about the distribution of examples, such as Gaussianity or uniformity over the domain. But how can one be confident that the data indeed satisfies such an assumption, so that one can trust the output quality of the agnostic learning algorithm? We propose a model in which to systematically study the design of tester-learner pairs $(A, T)$, such that if the distribution of examples in the data passes the tester $T$, then one can safely trust the output of the agnostic learner $A$ on the data. To demonstrate the power of the model, we apply it to the classical problem of agnostically learning halfspaces under the standard Gaussian distribution and present a tester-learner pair with a combined run-time of $n^{\tilde{O}(1/\epsilon^{4})}$. This…

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning