Applying MDL to Learning Best Model Granularity
Qiong Gao (Chinese Academy of Sciences), Ming Li (University of California, Santa Barbara), Paul Vitanyi (CWI, University of Amsterdam)

TL;DR
This paper demonstrates that the MDL principle effectively predicts the optimal model granularity in practical tasks, aligning theoretical predictions with experimental results in handwriting recognition and neural network modeling.
Contribution
It shows how MDL can determine the best model granularity in real-world problems, bridging theory and practice.
Findings
MDL accurately predicts the optimal sampling interval in handwriting recognition.
MDL correctly identifies the best number of hidden nodes in a neural network.
The theoretical MDL values match the experimentally optimal parameters.
Abstract
The Minimum Description Length (MDL) principle is solidly based on a provably ideal method of inference using Kolmogorov complexity. We test how the theory behaves in practice on a general problem in model selection: that of learning the best model granularity. The performance of a model depends critically on the granularity, for example the choice of precision of the parameters. Too high precision generally involves modeling of accidental noise and too low precision may lead to confusion of models that should be distinguished. This precision is often determined ad hoc. In MDL the best model is the one that most compresses a two-part code of the data set: this embodies "Occam's Razor." In two quite different experimental settings the theoretical value determined using MDL coincides with the best value found experimentally. In the first experiment the task is to recognize isolated…
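The two-part code idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's handwriting or neural-network experiment; it is a minimal illustration, assuming a Bernoulli model whose single parameter is stored with a chosen number of bits. The function names `two_part_code_length` and `best_precision`, and the quantization scheme, are hypothetical choices made for this example. The total description length is the cost of naming the quantized parameter plus the cost of encoding the data given that parameter, and the selected granularity is the precision that minimizes the sum.

```python
import math


def two_part_code_length(data, bits):
    """Total code length in bits for a 0/1 sequence under a Bernoulli model
    whose parameter is stored with `bits` bits of precision (toy assumption)."""
    n = len(data)
    ones = sum(data)
    levels = 2 ** bits
    # Quantize the maximum-likelihood estimate to the centre of one of
    # `levels` equal-width cells, so p always stays strictly inside (0, 1).
    p_hat = ones / n
    cell = min(int(p_hat * levels), levels - 1)
    p = (cell + 0.5) / levels
    # Part 1: cost of naming the chosen cell (the model).
    model_cost = bits
    # Part 2: cost of the data given the model (negative log-likelihood, base 2).
    data_cost = -(ones * math.log2(p) + (n - ones) * math.log2(1 - p))
    return model_cost + data_cost


def best_precision(data, max_bits=16):
    """Return the precision (in bits) that minimizes the two-part code length."""
    return min(range(max_bits + 1), key=lambda b: two_part_code_length(data, b))


if __name__ == "__main__":
    sample = [1] * 90 + [0] * 10
    for b in range(6):
        print(b, round(two_part_code_length(sample, b), 2))
    print("selected precision:", best_precision(sample))
```

On the sample of 90 ones and 10 zeros, this sketch selects 2 bits: fewer bits give a poor fit to the data, while more bits cost more to state than they save in encoding it. That trade-off is the one the abstract describes between too low and too high precision.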
Taxonomy
Topics: Computability, Logic, AI Algorithms · Machine Learning and Algorithms · Algorithms and Data Compression
