The Geometric Occam's Razor Implicit in Deep Learning
Benoit Dherin, Michael Munn, and David G.T. Barrett

TL;DR
This paper proposes that over-parameterized deep neural networks trained with stochastic gradient descent are implicitly regularized by their geometric model complexity, an effect the authors call the Geometric Occam's Razor.
Contribution
It introduces the Geometric Occam's Razor, an implicit regularizer in deep learning that ties the solutions found by neural networks to their geometric model complexity.
Findings
In one-dimensional regression, the geometric model complexity is the arc length of the learned function.
In higher dimensions, it depends on the Dirichlet energy of the function.
Dirichlet energy aligns with observed regularization in ResNets on CIFAR-10.
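
To make the two quantities in the findings concrete, here is a minimal numerical sketch. It assumes the standard textbook definitions: arc length as the integral of sqrt(1 + f'(x)^2), approximated by a polyline, and Dirichlet energy as the average squared gradient norm of f over a set of sample points. The function names arc_length and dirichlet_energy, the finite-difference step eps, and the sample-mean normalization are illustrative choices, not taken from the paper, whose exact definitions and scaling may differ.

```python
import numpy as np


def arc_length(xs, ys):
    """Polyline approximation of the arc length of a sampled 1D function.

    xs, ys: 1D arrays of equal length with ys[i] = f(xs[i]).
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # Sum the Euclidean lengths of the segments joining consecutive samples.
    return float(np.sum(np.hypot(np.diff(xs), np.diff(ys))))


def dirichlet_energy(f, points, eps=1e-5):
    """Mean squared gradient norm of a scalar function f over sample points.

    f: callable mapping a 1D array of shape (d,) to a float.
    points: array of shape (n, d).
    Gradients are estimated with central finite differences (an assumption
    made for self-containment; autodiff would normally be used for a network).
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    total = 0.0
    for x in points:
        grad = np.zeros(d)
        for i in range(d):
            step = np.zeros(d)
            step[i] = eps
            grad[i] = (f(x + step) - f(x - step)) / (2.0 * eps)
        total += float(grad @ grad)
    return total / n


# Usage: a straight line y = x on [0, 1], and a linear map on R^2.
xs = np.linspace(0.0, 1.0, 101)
line_length = arc_length(xs, xs)

rng = np.random.default_rng(0)
pts = rng.normal(size=(8, 2))
energy = dirichlet_energy(lambda x: 3.0 * x[0] + 4.0 * x[1], pts)
```

For the straight line the polyline length is exact, and for a linear map the gradient is constant, so both values can be checked by hand. A flatter function gives a shorter arc length and a lower Dirichlet energy, which is the sense in which these quantities measure geometric complexity.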
Abstract
In over-parameterized deep neural networks there can be many possible parameter configurations that fit the training data exactly. However, the properties of these interpolating solutions are poorly understood. We argue that over-parameterized neural networks trained with stochastic gradient descent are subject to a Geometric Occam's Razor; that is, these networks are implicitly regularized by the geometric model complexity. For one-dimensional regression, the geometric model complexity is simply given by the arc length of the function. For higher-dimensional settings, the geometric model complexity depends on the Dirichlet energy of the function. We explore the relationship between this Geometric Occam's Razor, the Dirichlet energy and other known forms of implicit regularization. Finally, for ResNets trained on CIFAR-10, we observe that Dirichlet energy measurements are consistent…
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
