On statistical learning via the lens of compression
Ofir David, Shay Moran, Amir Yehudayoff

TL;DR
This paper studies the connection between sample compression schemes and statistical learning. It establishes equivalences between learnability and compressibility, extending results known mainly for binary classification to multiclass categorization and to Vapnik's general learning setting, with applications to learnability and uniform convergence.
Contribution
It proves that learnability is equivalent to logarithmic sample compression in multiclass classification, and shows that extending this equivalence to Vapnik's general learning setting requires an approximate variant of compression.
Findings
Learnability is equivalent to compression to a subsample of size logarithmic in the sample size, in multiclass categorization (zero/one loss).
Uniform convergence implies constant-size compression in multiclass problems.
A dichotomy exists: non-trivial compression implies logarithmic compression in multiclass classification.
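To make the notion of a sample compression scheme concrete, here is a minimal sketch for a standard textbook case that is not taken from the paper: binary threshold classifiers on the real line, where h_t(x) = 1 if x >= t and 0 otherwise. Assuming the sample is labeled by some threshold (the realizable case), keeping only the smallest positively labeled point is enough to reconstruct a hypothesis that agrees with every label in the original sample, so the compression size is constant (at most one example). The names compress, reconstruct, and is_consistent are hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]  # (point on the real line, label in {0, 1})


def compress(sample: List[Example]) -> List[Example]:
    """Keep at most one example: the smallest positively labeled point."""
    positives = [ex for ex in sample if ex[1] == 1]
    if not positives:
        return []  # no positives: the empty subsample suffices
    return [min(positives, key=lambda ex: ex[0])]


def reconstruct(kept: List[Example]) -> Callable[[float], int]:
    """Rebuild a threshold hypothesis from the compressed subsample."""
    if not kept:
        return lambda x: 0  # predict 0 everywhere
    threshold = kept[0][0]
    return lambda x: 1 if x >= threshold else 0


def is_consistent(h: Callable[[float], int], sample: List[Example]) -> bool:
    """Check that h agrees with every label in the sample."""
    return all(h(x) == y for x, y in sample)


# Toy sample, assumed to be labeled by some threshold between 0.9 and 1.2.
sample = [(0.3, 0), (1.7, 1), (0.9, 0), (2.4, 1), (1.2, 1)]
kept = compress(sample)
h = reconstruct(kept)
print(kept, is_consistent(h, sample))
```

The reconstruction recovers all labels because every negative point lies below the true threshold, which in turn is at most the smallest positive point. The multiclass and general-loss settings treated in the paper are more involved; this toy case only illustrates the compress-then-reconstruct pattern.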
Abstract
This work continues the study of the relationship between sample compression schemes and statistical learning, which has been mostly investigated within the framework of binary classification. The central theme of this work is establishing equivalences between learnability and compressibility, and utilizing these equivalences in the study of statistical learning theory. We begin with the setting of multiclass categorization (zero/one loss). We prove that in this case learnability is equivalent to compression of logarithmic sample size, and that uniform convergence implies compression of constant size. We then consider Vapnik's general learning setting: we show that in order to extend the compressibility-learnability equivalence to this case, it is necessary to consider an approximate variant of compression. Finally, we provide some applications of the compressibility-learnability…
Taxonomy
Topics: Machine Learning and Algorithms · Computability, Logic, AI Algorithms · Imbalanced Data Classification Techniques
