Binarized Neural Networks Converge Toward Algorithmic Simplicity: Empirical Support for the Learning-as-Compression Hypothesis

Eduardo Y. Sakabe, Felipe S. Abrahão, Alexandre Simões, Esther Colombini, Paula Costa, Ricardo Gudwin, and Hector Zenil

arXiv:2505.20646 · cs.LG · September 19, 2025

Open Access

TL;DR

This paper provides empirical evidence that training Binarized Neural Networks (BNNs) is consistent with the view of learning as a process of algorithmic compression, and it uses algorithmic complexity measures to better understand neural network training dynamics.

Contribution

The paper introduces a novel application of algorithmic information theory, specifically the Block Decomposition Method (BDM), to the analysis of neural network training, and it supports the learning-as-compression hypothesis.

Findings

01. BDM correlates more strongly with training loss than entropy does.

02. Training involves the progressive internalization of structured regularities.

03. The results support viewing learning as algorithmic compression.

Abstract

Understanding and controlling the informational complexity of neural networks is a central challenge in machine learning, with implications for generalization, optimization, and model capacity. While most approaches rely on entropy-based loss functions and statistical metrics, these measures often fail to capture deeper, causally relevant algorithmic regularities embedded in network structure. We propose a shift toward algorithmic information theory, using Binarized Neural Networks (BNNs) as a first proxy. Grounded in algorithmic probability (AP) and the universal distribution it defines, our approach characterizes learning dynamics through a formal, causally grounded lens. We apply the Block Decomposition Method (BDM) -- a scalable approximation of algorithmic complexity based on AP -- and demonstrate that it more closely tracks structural changes during training than entropy,…
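The contrast the abstract draws between entropy and BDM can be illustrated with a minimal sketch. It is not the authors' implementation. BDM is usually defined as the sum, over the unique blocks of an object, of a per-block complexity estimate plus log2 of the block's multiplicity. The per-block estimate normally comes from precomputed CTM tables, which are not reproduced here; the function `literal_length` below is a hypothetical placeholder that charges every block its raw length in bits. The 4x4 block size, the 16x16 matrices, and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def shannon_entropy(bits):
    """Shannon entropy (bits per symbol) of the 0/1 symbol frequencies."""
    n = len(bits)
    counts = Counter(bits)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def blocks(matrix, size):
    """Split a square 0/1 matrix into non-overlapping size x size blocks."""
    n = len(matrix)
    out = []
    for r in range(0, n, size):
        for c in range(0, n, size):
            out.append(tuple(matrix[r + i][c + j]
                             for i in range(size) for j in range(size)))
    return out


def literal_length(block):
    # ASSUMPTION: placeholder for a CTM table lookup. It charges each block
    # its raw length in bits, so it ignores structure inside a block.
    return float(len(block))


def bdm(matrix, size=4, block_complexity=literal_length):
    """BDM-style aggregation: sum over unique blocks of K(block) + log2(count)."""
    counts = Counter(blocks(matrix, size))
    return sum(block_complexity(b) + math.log2(n) for b, n in counts.items())


N = 16
# A highly regular binary "weight matrix": a checkerboard.
checker = [[(r + c) % 2 for c in range(N)] for r in range(N)]

# The same bits in a shuffled order: identical 0/1 frequencies, no regularity.
flat = [bit for row in checker for bit in row]
random.Random(0).shuffle(flat)
shuffled = [flat[r * N:(r + 1) * N] for r in range(N)]

h_checker = shannon_entropy([bit for row in checker for bit in row])
h_shuffled = shannon_entropy(flat)
bdm_checker = bdm(checker)
bdm_shuffled = bdm(shuffled)

print(f"entropy: checker={h_checker:.3f}  shuffled={h_shuffled:.3f}")
print(f"BDM-like: checker={bdm_checker:.1f}  shuffled={bdm_shuffled:.1f}")
```

Both matrices contain the same number of zeros and ones, so the symbol-frequency entropy cannot tell them apart, while the block-based score is much lower for the checkerboard because all of its blocks are identical. With real CTM values in place of `literal_length`, the score would also reflect regularities inside each block.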

Taxonomy

Topics: Neural Networks and Applications