Average-Case Information Complexity of Learning
Ido Nachum, Amir Yehudayoff

TL;DR
This paper studies how much information a learning algorithm reveals, on average over the concepts in a class of VC-dimension d. It shows that most concepts can be learned while leaking only a small amount of information, and that a low-information learner that knows the input distribution yields one that leaks little on an average concept without knowing the distribution.
Contribution
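The paper shows that there is a proper learning algorithm that reveals only O(d) bits of information for most concepts in a class of VC-dimension d. This is a special case of a more general phenomenon that relates low-information learning with a known input distribution to low average-case leakage with an unknown one.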
Findings
Most concepts do not require large information leakage.
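There exists a proper learning algorithm that reveals O(d) bits of information for most concepts in the class.
If a low-information learner exists when the input distribution is known, then there is a learner that reveals little information on an average concept without knowing the distribution.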
Abstract
How many bits of information are revealed by a learning algorithm for a concept class of VC-dimension d? Previous works have shown that even for d = 1 the amount of information may be unbounded (tend to infinity with the universe size). Can it be that all concepts in the class require leaking a large amount of information? We show that typically concepts do not require leakage. There exists a proper learning algorithm that reveals O(d) bits of information for most concepts in the class. This result is a special case of a more general phenomenon we explore. If there is a low information learner when the algorithm knows the underlying distribution on inputs, then there is a learner that reveals little information on an average concept without knowing the distribution on inputs.
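To make "bits of information revealed" concrete, here is a minimal sketch that is not taken from the paper. It assumes the measure in question is the mutual information I(S; A(S)) between the input sample S and the learner's output A(S), which for a deterministic learner equals the entropy H(A(S)). The threshold class, the uniform input distribution, and the names `output_entropy_bits`, `threshold`, and `smallest_positive_learner` are all hypothetical choices made for illustration.

```python
import itertools
import math
from collections import Counter


def output_entropy_bits(learner, concept, universe, m):
    """Entropy H(A(S)) in bits of a deterministic learner's output.

    S is m i.i.d. uniform draws from `universe`, labeled by `concept`.
    Assumption: leakage is measured as I(S; A(S)), which equals H(A(S))
    when the learner is deterministic.
    """
    counts = Counter()
    # Enumerate every possible sample; each is equally likely under uniform draws.
    for xs in itertools.product(universe, repeat=m):
        sample = tuple((x, concept(x)) for x in xs)
        counts[learner(sample)] += 1
    total = len(universe) ** m
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def threshold(t):
    """Toy concept of VC-dimension 1: label 1 iff x >= t."""
    return lambda x: int(x >= t)


def smallest_positive_learner(sample):
    """Toy proper learner: output the smallest positively labeled point
    as the threshold, or None if the sample has no positive example."""
    positives = [x for x, y in sample if y == 1]
    return min(positives) if positives else None


if __name__ == "__main__":
    universe = range(4)
    for t in range(5):
        bits = output_entropy_bits(smallest_positive_learner, threshold(t), universe, m=1)
        print(f"threshold {t}: {bits:.3f} bits")
```

In this toy setting the leakage depends on the concept: the all-positive threshold makes the learner echo its single sample point, while the all-negative threshold produces a constant output. Averaging such quantities over the concepts in a class is the kind of average-case question the abstract describes.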
Taxonomy
Topics: Machine Learning and Algorithms · Computability, Logic, AI Algorithms · Advanced Bandit Algorithms Research
