Asymptotics of Discrete MDL for Online Prediction
Jan Poland, Marcus Hutter

TL;DR
This paper analyzes the asymptotic behavior of two-part MDL in online prediction for countable model classes, proving convergence to true distributions and deriving bounds on prediction loss.
Contribution
It introduces static and dynamic MDL prediction methods for online learning and proves that both converge almost surely, under the sole assumption that the data are generated by a distribution contained in the countable model class.
Findings
MDL predictions converge almost surely to true data-generating distributions.
Finite bounds on the quadratic, Hellinger, and Kullback-Leibler loss of the MDL learner are established, though they are exponentially worse than the corresponding bounds for Bayesian prediction.
Results apply broadly to sequence prediction, classification, regression, and universal induction.
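To make "exponentially worse" concrete, the comparison can be sketched as follows. The notation is an assumption of this note and is not taken from the summary above: $w_\mu$ denotes the prior weight of the true distribution $\mu$, $L^{\text{Bayes}}$ and $L^{\text{MDL}}$ denote the expected cumulative loss of the Bayes mixture and of the MDL predictor, and constant factors are omitted.

```latex
% Sketch only: orders of magnitude, constants omitted.
% w_\mu = prior weight of the true distribution \mu (assumed notation).
\[
  L^{\text{Bayes}} \;\lesssim\; \ln \frac{1}{w_\mu},
  \qquad
  L^{\text{MDL}} \;\lesssim\; \frac{1}{w_\mu}
  \;=\; \exp\!\Big(\ln \frac{1}{w_\mu}\Big).
\]
```

Read this way, the MDL bound is the exponential of the Bayesian one: both are finite, but the MDL bound grows much faster as the true model's prior weight shrinks.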
Abstract
Minimum Description Length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning non-i.i.d. processes by means of two-part MDL, where the underlying model class is countable. We consider the online learning framework, i.e. observations come in one by one, and the predictor is allowed to update his state of mind after each time step. We identify two ways of predicting by MDL for this setup, namely a static and a dynamic one. (A third variant, hybrid MDL, will turn out inferior.) We will prove that under the only assumption that the data is generated by a distribution contained in the model class, the MDL predictions converge to the true values almost surely. This is accomplished by proving finite bounds on the quadratic, the Hellinger, and the Kullback-Leibler loss of the MDL learner, which are…
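The difference between the static and the dynamic predictor can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the model class is a small finite set of Bernoulli models standing in for a countable class, the function names are hypothetical, and the dynamic predictor is normalized over the two possible next symbols, which is a choice of this sketch and not necessarily the paper's definition. A Bayes mixture is included only for comparison.

```python
import math

def log_lik(theta, xs):
    """Log-likelihood of binary sequence xs under Bernoulli(theta), 0 < theta < 1."""
    ones = sum(xs)
    zeros = len(xs) - ones
    return ones * math.log(theta) + zeros * math.log(1.0 - theta)

def log_mdl_score(thetas, weights, xs):
    """Two-part MDL score: max over models of log(prior weight) + log-likelihood."""
    return max(math.log(w) + log_lik(t, xs) for t, w in zip(thetas, weights))

def static_mdl(thetas, weights, xs):
    """Static MDL: pick the single best model on the data seen so far,
    then predict the next symbol with that model. Returns P(next = 1)."""
    best = max(
        range(len(thetas)),
        key=lambda i: math.log(weights[i]) + log_lik(thetas[i], xs),
    )
    return thetas[best]

def dynamic_mdl(thetas, weights, xs):
    """Dynamic MDL: re-run the MDL maximization for each possible
    continuation of xs, so a different model may win for each next symbol.
    Normalized over {0, 1} here (an assumption of this sketch)."""
    r1 = math.exp(log_mdl_score(thetas, weights, xs + [1]))
    r0 = math.exp(log_mdl_score(thetas, weights, xs + [0]))
    return r1 / (r0 + r1)

def bayes_mixture(thetas, weights, xs):
    """Bayes mixture prediction, for comparison: posterior-weighted average."""
    post = [w * math.exp(log_lik(t, xs)) for t, w in zip(thetas, weights)]
    return sum(p * t for p, t in zip(post, thetas)) / sum(post)

if __name__ == "__main__":
    thetas = [0.25, 0.5, 0.75]    # hypothetical model class
    weights = [0.5, 0.25, 0.25]   # hypothetical prior weights, summing to 1
    xs = [1, 1, 0, 1, 1, 1, 0, 1]
    print("static :", static_mdl(thetas, weights, xs))
    print("dynamic:", dynamic_mdl(thetas, weights, xs))
    print("bayes  :", bayes_mixture(thetas, weights, xs))
```

On this toy input the same model wins every maximization, so the static and dynamic predictions coincide; they can differ when the best model changes depending on the hypothesized next symbol.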
