Approximability and Generalisation
Andrew J. Turner, Ata Kabán

TL;DR
This paper explores how the ability to approximate predictors influences their generalisation in machine learning, providing bounds, algorithms, and insights into the role of data and structure in learning efficiency.
Contribution
It introduces a framework linking approximability to generalisation, offering bounds, algorithms, and structural insights that improve understanding of compressed predictors in learning.
Findings
Approximable concepts can be learned with fewer labelled samples using unlabelled data.
Algorithms are proposed that ensure predictors and their approximations generalise well.
Structural properties in sensitivities can reduce or eliminate the need for additional unlabelled data.
Abstract
Approximate learning machines have become popular in the era of small devices, including quantised, factorised, hashed, or otherwise compressed predictors, and the quest to explain and guarantee good generalisation abilities for such methods has just begun. In this paper we study the role of approximability in learning, both in the full precision and the approximated settings of the predictor that is learned from the data, through a notion of sensitivity of predictors to the action of the approximation operator at hand. We prove upper bounds on the generalisation of such predictors, yielding the following main findings, for any PAC-learnable class and any given approximation operator. 1) We show that under mild conditions, approximable target concepts are learnable from a smaller labelled sample, provided sufficient unlabelled data. 2) We give algorithms that guarantee a good predictor…
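The abstract's notion of sensitivity to an approximation operator can be made concrete with a small numerical sketch. The formal definition is not given in this summary, so everything below is an assumption made for illustration: the approximation operator is a uniform weight quantiser (the hypothetical `quantise`), the predictor is linear, and sensitivity is taken to be the mean squared gap between the predictor's outputs and its approximation's outputs (the hypothetical `empirical_sensitivity`). The point it illustrates is that such a quantity can be estimated from unlabelled inputs alone, since no labels enter the computation.

```python
import numpy as np


def quantise(w, n_bits=4):
    """Hypothetical approximation operator: snap weights to a uniform grid
    of 2**n_bits levels spanning [w.min(), w.max()]."""
    levels = 2 ** n_bits - 1
    lo, hi = w.min(), w.max()
    if hi == lo:
        return w.copy()
    step = (hi - lo) / levels
    return lo + np.round((w - lo) / step) * step


def empirical_sensitivity(w, w_approx, X_unlabelled):
    """Assumed sensitivity measure: mean squared difference between the
    outputs of a linear predictor and its approximation on unlabelled inputs."""
    gap = X_unlabelled @ w - X_unlabelled @ w_approx
    return float(np.mean(gap ** 2))


# Synthetic unlabelled sample and a full-precision linear predictor.
rng = np.random.default_rng(0)
X_unlabelled = rng.normal(size=(1000, 20))
w = rng.normal(size=20)

# Coarser quantisation should perturb the predictor's outputs more.
sens_4 = empirical_sensitivity(w, quantise(w, 4), X_unlabelled)
sens_8 = empirical_sensitivity(w, quantise(w, 8), X_unlabelled)
print(f"4-bit sensitivity: {sens_4:.6f}")
print(f"8-bit sensitivity: {sens_8:.6f}")
```

In this toy setting, a predictor whose outputs barely move under quantisation has low sensitivity. The sketch only shows how such a quantity might be measured; it does not reproduce the paper's bounds or algorithms.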
Taxonomy
Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques
