Approximation of the Two-Part MDL Code

Pieter Adriaans (University of Amsterdam); Paul Vitanyi (CWI and University of Amsterdam)

arXiv:cs/0612095·cs.LG·September 15, 2008·IEEE Trans. Inf. Theory

TL;DR

Approximating the optimal two-part MDL code for given data through successively shorter two-part codes raises problems of computation time, convergence, and model fit. The paper analyzes these problems, using Kolmogorov complexity to express the goodness of fit of individual models for individual data sets.

Contribution

The paper analyzes the theoretical properties of successive approximations to the optimal two-part MDL code, emphasizing their limitations and the reliance on Kolmogorov complexity to express goodness of fit.

Findings

01

Each approximation step may take arbitrarily long to compute.

02

The sequence of models generated may not monotonically improve the goodness of fit.

03

The model associated with the optimal code has (almost) the best goodness of fit.

Abstract

Approximation of the optimal two-part MDL code for given data, through successive monotonically length-decreasing two-part MDL codes, has the following properties: (i) computation of each step may take arbitrarily long; (ii) we may not know when we reach the optimum, or whether we will reach the optimum at all; (iii) the sequence of models generated may not monotonically improve the goodness of fit; but (iv) the model associated with the optimum has (almost) the best goodness of fit. To express the practically interesting goodness of fit of individual models for individual data sets we have to rely on Kolmogorov complexity.
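The idea of successive, monotonically length-decreasing two-part codes can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the paper's construction: the model class (Bernoulli parameters quantized to a fixed number of bits), the function names `two_part_length` and `successive_codes`, and the use of a computable code length in place of Kolmogorov complexity are all hypothetical choices made for illustration. The cost of encoding the precision itself is ignored. Because every quantity here is computable and the search space is finite, the sketch does not exhibit properties (i) to (iv); it only shows what a sequence of strictly shorter two-part codes looks like.

```python
import math


def two_part_length(data, k, precision_bits):
    """Return the two-part code length in bits for a toy Bernoulli model.

    Assumption (illustrative only): the model is a Bernoulli parameter
    p = (k + 0.5) / 2**precision_bits, so naming the model costs
    `precision_bits` bits. The data part costs -log2 P(data | p) bits.
    """
    p = (k + 0.5) / 2 ** precision_bits  # always strictly inside (0, 1)
    ones = sum(data)
    zeros = len(data) - ones
    model_bits = precision_bits
    data_bits = -(ones * math.log2(p) + zeros * math.log2(1 - p))
    return model_bits + data_bits


def successive_codes(data, max_precision=8):
    """Yield (total_bits, (k, precision_bits)) each time a shorter code is found.

    Candidate models are enumerated in a fixed order; a candidate is reported
    only if its two-part code is strictly shorter than every earlier one, so
    the reported lengths decrease monotonically.
    """
    best = math.inf
    for precision_bits in range(max_precision + 1):
        for k in range(2 ** precision_bits):
            total = two_part_length(data, k, precision_bits)
            if total < best:
                best = total
                yield total, (k, precision_bits)


if __name__ == "__main__":
    sample = [1] * 30 + [0] * 10
    for total, model in successive_codes(sample):
        print(f"{total:8.3f} bits  model (k, precision) = {model}")
```

In this toy the enumeration always terminates and the last reported code is the shortest one in the class. In the setting of the abstract, where model complexity is measured by Kolmogorov complexity, neither guarantee is available.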

Taxonomy

Topics: Computability, Logic, AI Algorithms · Algorithms and Data Compression · Machine Learning and Algorithms