Deep neural networks have an inbuilt Occam's razor
Chris Mingard, Henry Rees, Guillermo Valle-Pérez, Ard A. Louis

TL;DR
This paper uses a Bayesian framework to analyze how deep neural networks inherently favor simpler functions, which explains their success in structured data tasks despite their overparameterization.
Contribution
It introduces a Bayesian perspective to understand DNNs, revealing an intrinsic Occam's razor bias towards simple functions that aids generalization.
Findings
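Structured data, combined with an intrinsic bias towards (Kolmogorov) simple functions, is identified as central to DNN success.
The prior over functions is determined by the network and can be varied by moving between ordered and chaotic regimes.
For Boolean function classification, combining this prior with a likelihood approximated from the error spectrum of functions on data accurately predicts the posterior measured for DNNs trained with stochastic gradient descent.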
Abstract
The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components, we apply a Bayesian picture, based on the functions expressed by a DNN, to supervised learning. The prior over functions is determined by the network, and is varied by exploiting a transition between ordered and chaotic regimes. For Boolean function classification, we approximate the likelihood using the error spectrum of functions on data. When combined with the prior, this accurately predicts the posterior, measured for DNNs trained with stochastic gradient descent. This analysis reveals that structured data, combined with an intrinsic Occam's razor-like inductive bias towards (Kolmogorov) simple functions that is strong enough to counteract the exponential growth of…
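To make the idea of a "prior over functions" concrete, the following is a minimal sketch, not the authors' procedure. It samples random weights for a small fully connected network, records which Boolean function each sample expresses on all 2^n inputs, and counts how often each function appears. The architecture, the width, the bias scale, the parameter name `sigma_w`, and the use of zlib-compressed length as a crude stand-in for Kolmogorov complexity are all assumptions made for illustration.

```python
import itertools
import zlib
from collections import Counter

import numpy as np


def boolean_inputs(n):
    """All 2**n Boolean inputs as an array of shape (2**n, n) with entries in {0, 1}."""
    return np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)


def sample_function(x, width, sigma_w, rng):
    """Draw random weights for a one-hidden-layer tanh network (an assumed
    architecture) and return the Boolean function it expresses on x as a bit string."""
    n = x.shape[1]
    w1 = rng.normal(0.0, sigma_w / np.sqrt(n), size=(n, width))
    b1 = rng.normal(0.0, 0.2, size=width)  # bias scale is an arbitrary choice
    w2 = rng.normal(0.0, sigma_w / np.sqrt(width), size=width)
    b2 = rng.normal(0.0, 0.2)
    out = np.tanh(x @ w1 + b1) @ w2 + b2
    # Threshold the scalar output at zero to obtain one output bit per input.
    return "".join("1" if v > 0 else "0" for v in out)


def estimate_prior(n=5, width=40, sigma_w=1.0, samples=2000, seed=0):
    """Empirical prior: how often each function appears under random weights."""
    rng = np.random.default_rng(seed)
    x = boolean_inputs(n)
    return Counter(sample_function(x, width, sigma_w, rng) for _ in range(samples))


def complexity_proxy(bits):
    """Compressed length in bytes; a rough, hypothetical proxy for complexity."""
    return len(zlib.compress(bits.encode("ascii")))


if __name__ == "__main__":
    counts = estimate_prior()
    total = sum(counts.values())
    # Print the most frequently sampled functions with their estimated prior mass.
    for f, c in counts.most_common(5):
        print(f, c / total, complexity_proxy(f))
```

Rerunning `estimate_prior` with different values of `sigma_w` is one way to explore how the empirical distribution over functions shifts with the weight scale. On inputs this small the compression proxy is coarse, so it is best read as a qualitative indicator only.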
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Machine Learning and Algorithms
