Learning, compression, and leakage: Minimising classification error via meta-universal compression principles
Fernando E. Rosas, Pedro A.M. Mediano, Michael Gastpar

TL;DR
This paper introduces an NML-based classification strategy that attains heuristic PAC learning and whose misclassification rate is upper bounded by the maximal leakage, linking learning and compression to a metric used in privacy analysis.
Contribution
It proposes an NML-based decision strategy for supervised classification, shows that it attains heuristic PAC learning for a wide variety of models, and upper bounds its misclassification rate by the maximal leakage.
Findings
Attains heuristic PAC learning when applied to a wide variety of models.
Upper bounds the misclassification rate by the maximal leakage.
Connects compression principles with a leakage metric from privacy-sensitive scenarios.
Abstract
Learning and compression are driven by the common aim of identifying and exploiting statistical regularities in data, which opens the door for fertile collaboration between these areas. A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding, which provides strong guarantees for compression of small datasets, in contrast with more popular estimators whose guarantees hold only in the asymptotic limit. Here we consider an NML-based decision strategy for supervised classification problems, and show that it attains heuristic PAC learning when applied to a wide variety of models. Furthermore, we show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric to quantify the potential of data leakage in privacy-sensitive scenarios.
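
To make the two quantities named in the abstract concrete, here is a minimal sketch of their standard textbook definitions. It is not the paper's decision strategy. It assumes a Bernoulli model class for the NML distribution and a finite channel matrix for the maximal leakage; the function names are hypothetical and chosen only for illustration.

```python
from math import comb, log2


def bernoulli_nml_normaliser(n):
    """Sum of maximised likelihoods over all binary sequences of length n.

    Sequences are grouped by their number of ones k; the maximum-likelihood
    estimate for each group is p = k / n.
    """
    total = 0.0
    for k in range(n + 1):
        p = k / n
        # Python evaluates 0.0 ** 0 as 1.0, which is the convention needed here.
        total += comb(n, k) * (p ** k) * ((1 - p) ** (n - k))
    return total


def bernoulli_nml(sequence):
    """NML probability of a binary sequence under the Bernoulli model class."""
    n = len(sequence)
    k = sum(sequence)
    p = k / n
    max_likelihood = (p ** k) * ((1 - p) ** (n - k))
    return max_likelihood / bernoulli_nml_normaliser(n)


def maximal_leakage(channel):
    """Maximal leakage in bits: log2 of the sum over y of max over x of P(y|x).

    channel[x][y] holds P(y|x); every input x is assumed to have
    positive probability.
    """
    n_outputs = len(channel[0])
    column_maxima = (max(row[y] for row in channel) for y in range(n_outputs))
    return log2(sum(column_maxima))
```

In this sketch a noiseless binary channel leaks one bit, while a channel whose output is independent of its input leaks nothing; the NML probabilities of all sequences of a given length sum to one by construction.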
