Learners that Use Little Information
Raef Bassily, Shay Moran, Ido Nachum, Jonathan Shafer, Amir Yehudayoff

TL;DR
This paper investigates learning algorithms constrained to minimal information use, demonstrating their generalization ability, establishing sample complexity bounds, and exploring their connections to concepts like compression and privacy.
Contribution
Introduces the concept of d-bit information learners, analyzes their generalization and sample complexity, and connects them to existing notions like compression schemes and differential privacy.
Findings
d-bit information learners generalize well
Sample complexity bounds have tight dependence on the confidence and error parameters
Existence of low-information ERM for VC classes in distribution-dependent setting
Abstract
We study learning algorithms that are restricted to using a small amount of information from their input sample. We introduce a category of learning algorithms we term d-bit information learners, which are algorithms whose output conveys at most d bits of information of their input. A central theme in this work is that such algorithms generalize. We focus on the learning capacity of these algorithms, and prove sample complexity bounds with tight dependencies on the confidence and error parameters. We also observe connections with well-studied notions such as sample compression schemes, Occam's razor, PAC-Bayes and differential privacy. We discuss an approach that allows us to prove upper bounds on the amount of information that algorithms reveal about their inputs, and also provide a lower bound by showing a simple concept class for which every (possibly randomized) empirical…
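To make the notion of "bits of information conveyed by the output" concrete, the following is a minimal sketch, not the authors' construction. It assumes that information is measured as the Shannon mutual information between the input sample S and the output A(S), and it computes that quantity exactly for a toy learner. The domain, the target threshold, and the names `mutual_information_bits`, `erm_threshold`, and `learner_information` are all hypothetical choices made for illustration.

```python
import itertools
import math
from collections import defaultdict


def mutual_information_bits(joint):
    """Return I(S; H) in bits, given a dict {(s, h): probability}."""
    p_s = defaultdict(float)
    p_h = defaultdict(float)
    for (s, h), p in joint.items():
        p_s[s] += p
        p_h[h] += p
    return sum(
        p * math.log2(p / (p_s[s] * p_h[h]))
        for (s, h), p in joint.items()
        if p > 0
    )


# Hypothetical toy problem: points 0..3, labeled by the threshold x >= 2.
DOMAIN = range(4)
TARGET = 2


def label(x):
    return int(x >= TARGET)


def erm_threshold(sample):
    """Deterministic ERM: the smallest threshold consistent with the sample."""
    for t in range(len(DOMAIN) + 1):
        if all(int(x >= t) == y for x, y in sample):
            return t
    raise ValueError("no consistent threshold")  # unreachable: TARGET always fits


def learner_information(m):
    """Exact I(S; A(S)) in bits for m i.i.d. uniform examples."""
    joint = defaultdict(float)
    p = (1.0 / len(DOMAIN)) ** m
    for xs in itertools.product(DOMAIN, repeat=m):
        sample = tuple((x, label(x)) for x in xs)
        joint[(sample, erm_threshold(sample))] += p
    return mutual_information_bits(joint)


if __name__ == "__main__":
    for m in range(1, 5):
        print(m, learner_information(m))
```

In this toy setup the learner only ever outputs one of three thresholds, so the information it reveals stays at or below log2(3) bits however large the sample grows. With a single example (m = 1) the value is exactly 1.5 bits. Under the mutual-information assumption above, such a learner would count as a d-bit information learner for any d of at least log2(3).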
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Privacy-Preserving Technologies in Data
