Double Descent Risk and Volume Saturation Effects: A Geometric Perspective
Prasad Cheema, Mahito Sugiyama

TL;DR
This paper explores the double descent risk phenomenon in machine learning through a geometric lens, analyzing how model volume impacts generalization and challenges traditional notions of model complexity.
Contribution
It introduces a geometric perspective based on the logarithm of model volume to explain double descent and volume saturation effects in specific model classes.
Findings
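For isotropic linear regression and statistical lattices, the logarithm of the model volume decomposes into a sum of distinct components that help explain double descent.
Model volume analysis suggests why generalization error does not necessarily keep growing with increasing model dimensionality.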
Geometric insights connect model volume to classical model selection criteria like AIC and BIC.
Abstract
The appearance of the double-descent risk phenomenon has received growing interest in the machine learning and statistics communities, as it challenges well-understood notions behind the U-shaped train-test curves. Motivated by Rissanen's minimum description length (MDL), Balasubramanian's Occam's Razor, and Amari's information geometry, we investigate how the logarithm of the model volume, log V, extends the intuition behind the AIC and BIC model selection criteria. We find that for the particular model classes of isotropic linear regression and statistical lattices, the log V term may be decomposed into a sum of distinct components, each of which helps explain the appearance of this phenomenon. In particular, they suggest why generalization error does not necessarily continue to grow with increasing model dimensionality.
Taxonomy
Topics: Neural Networks and Applications · Statistical Methods and Inference · Statistical Mechanics and Entropy
Methods: Linear Regression
