The Description Length of Deep Learning Models
Léonard Blier, Yann Ollivier

TL;DR
This paper investigates the data compression capabilities of deep neural networks through the lens of Solomonoff's theory and the Minimum Description Length principle, revealing that simple encoding methods outperform variational approaches in compression efficiency.
Contribution
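The paper demonstrates experimentally that deep neural networks can compress their training data even when the cost of encoding the parameters is counted. It also shows that simple incremental encoding methods outperform variational techniques at this task, challenging the assumption that a large parameter count conflicts with the Minimum Description Length principle.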
Findings
Deep neural networks can compress training data effectively.
Variational methods provide poor compression bounds, despite being explicitly built to minimize such bounds.
Simple incremental encoding methods yield excellent compression results.
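To make the idea of incremental encoding concrete, here is a minimal sketch of a prequential (predict-then-update) code. It is an illustration under stated assumptions, not the authors' implementation: the "model" is a Laplace-smoothed class-frequency estimator standing in for a neural network retrained on the data seen so far, and the function names and toy labels are hypothetical.

```python
import math


def uniform_codelength(labels, num_classes):
    """Bits needed to send the labels with a fixed uniform code (no learning)."""
    return len(labels) * math.log2(num_classes)


def prequential_codelength(labels, num_classes):
    """Bits needed when each label is coded using only the labels seen before it.

    Assumption: a Laplace-smoothed frequency estimator plays the role of the
    model; in a deep learning setting it would be a network refit incrementally.
    """
    counts = [1] * num_classes  # one pseudo-count per class (Laplace smoothing)
    total_bits = 0.0
    for y in labels:
        p = counts[y] / sum(counts)   # predict before seeing the label
        total_bits += -math.log2(p)   # cost of encoding the label under p
        counts[y] += 1                # then update the model with the label
    return total_bits


# Toy data (hypothetical): a skewed binary label sequence.
labels = [0] * 90 + [1] * 10
uniform_bits = uniform_codelength(labels, 2)
prequential_bits = prequential_codelength(labels, 2)
print(f"uniform: {uniform_bits:.1f} bits, prequential: {prequential_bits:.1f} bits")
```

No parameters are transmitted in this scheme: the receiver rebuilds the same model from the labels already decoded, so the total code length is just the sum of the per-label prediction costs.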
Abstract
Solomonoff's general theory of inference and the Minimum Description Length principle formalize Occam's razor, and hold that a good model of data is a model that is good at losslessly compressing the data, including the cost of describing the model itself. Deep neural networks might seem to go against this principle given the large number of parameters to be encoded. We demonstrate experimentally the ability of deep neural networks to compress the training data even when accounting for parameter encoding. The compression viewpoint originally motivated the use of variational methods in neural networks. Unexpectedly, we found that these variational methods provide surprisingly poor compression bounds, despite being explicitly built to minimize such bounds. This might explain the relatively poor practical performance of variational methods in deep learning. On the other hand, simple…
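The accounting described in the abstract can be written out as a sketch. The notation below ($\theta$ for parameters, $\alpha$ for a prior and $\beta$ for an approximate posterior over parameters, $x_i$ and $y_i$ for inputs and labels) is assumed for illustration and does not come from this page. A two-part code pays for the parameters and then for the labels given the parameters; the standard variational code replaces the parameter cost with a KL divergence, here measured in bits:

```latex
\begin{align}
L_{\text{two-part}}(y_{1:n} \mid x_{1:n})
  &= L(\theta) + \sum_{i=1}^{n} -\log_2 p_\theta(y_i \mid x_i) \\
L_{\text{var}}(y_{1:n} \mid x_{1:n})
  &= \mathrm{KL}(\beta \,\|\, \alpha)
   + \mathbb{E}_{\theta \sim \beta}\!\left[\sum_{i=1}^{n} -\log_2 p_\theta(y_i \mid x_i)\right]
\end{align}
```

In both cases a model compresses the data only if the total falls below the cost of a code that ignores the inputs, such as $n \log_2 K$ bits for $K$ equally likely classes.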
Taxonomy
Topics: Topic Modeling · Machine Learning and Data Classification · Machine Learning in Healthcare
