Minimum Description Length Principle in Supervised Learning with Application to Lasso
Masanori Kawakita, Jun'ichi Takeuchi

TL;DR
This paper extends Barron and Cover's MDL theory to supervised learning, providing finite-sample risk bounds and applying it to derive new bounds for Lasso with random design, even when features outnumber samples.
Contribution
An extension of BC theory to supervised learning that offers finite-sample risk bounds with minimal assumptions and applies to Lasso with random design.
Findings
Risk bounds hold for any finite sample size and any number of features.
The bounds remain valid even when the number of features exceeds the number of samples.
Numerical simulations illustrate the behavior of the regret bounds.
Abstract
The minimum description length (MDL) principle in supervised learning is studied. One of the most important theories for the MDL principle is Barron and Cover's theory (BC theory), which gives a mathematical justification of the MDL principle. The original BC theory, however, can be applied to supervised learning only approximately and in limited settings. Though Barron et al. recently succeeded in removing a similar approximation in the case of unsupervised learning, their idea essentially cannot be applied to supervised learning in general. To overcome this issue, an extension of BC theory to supervised learning is proposed. The derived risk bound has several advantages inherited from the original BC theory. First, the risk bound holds for finite sample size. Second, it requires remarkably few assumptions. Third, the risk bound takes the form of the redundancy of the two-stage code for the MDL procedure.…
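The two-stage reading of Lasso mentioned above can be illustrated with a small sketch. It is not the paper's procedure or its simulation setup: the Gaussian random design, the sizes `n` and `p`, the value of `lam`, and the ISTA solver are all assumptions chosen for illustration. The sketch fits Lasso with more features than samples and splits the objective into a data-fit term (playing the role of the data codelength under a Gaussian model) and an l1 penalty term (playing the role of the model codelength); the exact weighting and normalization used in the paper are not given in this summary.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the Lipschitz constant of the gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

def two_part_objective(X, y, beta, lam):
    """Return (fit, penalty): the data-fit term and the l1 penalty term."""
    n = X.shape[0]
    fit = 0.5 * np.sum((y - X @ beta) ** 2) / n
    penalty = lam * np.sum(np.abs(beta))
    return fit, penalty

# Hypothetical random-design problem with more features than samples (p > n).
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 2.0  # assumed sparse ground truth, for illustration only
y = X @ beta_true + 0.5 * rng.standard_normal(n)

lam = 0.2  # assumed penalty level, not taken from the paper
beta_hat = lasso_ista(X, y, lam)
fit, penalty = two_part_objective(X, y, beta_hat, lam)
print(f"fit term: {fit:.4f}, penalty term: {penalty:.4f}, "
      f"nonzeros: {int(np.count_nonzero(beta_hat))} of {p}")
```

ISTA is used here only because it fits in a few lines with NumPy alone; any Lasso solver would serve the same illustrative purpose.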
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Probabilistic and Robust Engineering Design · Statistical Methods and Inference
Methods: Minimum Description Length
