Towards Understanding Generalization via Decomposing Excess Risk Dynamics
Jiaye Teng, Jianhao Ma, Yang Yuan

TL;DR
This paper introduces a decomposition framework to analyze neural network generalization by separating signal and noise in excess risk dynamics, leading to improved stability-based bounds and better understanding of overparameterized models.
Contribution
It proposes a novel decomposition approach that refines stability-based generalization bounds by analyzing signal and noise separately, applicable to both linear and non-linear regimes.
Findings
Decomposition improves stability-based bounds in linear and non-linear models.
Framework verified on neural networks and linear regression.
Better understanding of slow convergence on noise in neural networks.
Abstract
Generalization is one of the fundamental issues in machine learning. However, traditional techniques like uniform convergence may be unable to explain generalization under overparameterization. As alternative approaches, techniques based on stability analyze the training dynamics and derive algorithm-dependent generalization bounds. Unfortunately, the stability-based bounds are still far from explaining the surprising generalization in deep learning since neural networks usually suffer from unsatisfactory stability. This paper proposes a novel decomposition framework to improve the stability-based bounds via a more fine-grained analysis of the signal and noise, inspired by the observation that neural networks converge relatively slowly when fitting noise (which indicates better stability). Concretely, we decompose the excess risk dynamics and apply the stability-based bound only on the…
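The signal/noise split described in the abstract can be illustrated on overparameterized linear regression, one of the settings the summary says the framework was verified on. The sketch below is a toy illustration under stated assumptions, not the authors' code. It assumes full-batch gradient descent from zero initialization, isotropic Gaussian covariates, and a step size set from the spectral norm of the data. The names `gd_path`, `bias_term`, and `variance_term` are hypothetical. Because gradient descent on squared loss is linear in the labels, the trajectory trained on noisy labels equals the sum of a trajectory trained on clean labels (signal) and one trained on pure noise. The sketch only tracks the two excess-risk components and does not compute any stability bound.

```python
import numpy as np

def gd_path(X, y, lr, steps):
    """Full-batch gradient descent on mean squared error, starting from zero."""
    n, d = X.shape
    w = np.zeros(d)
    path = [w.copy()]
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n
        w = w - lr * grad
        path.append(w.copy())
    return np.array(path)  # shape (steps + 1, d)

rng = np.random.default_rng(0)
n, d = 50, 200                            # overparameterized: d > n
X = rng.normal(size=(n, d)) / np.sqrt(d)  # assumed covariate covariance: I / d
w_star = rng.normal(size=d)               # assumed ground-truth parameter
noise = 0.5 * rng.normal(size=n)          # assumed label noise
y = X @ w_star + noise

# Step size 1 / lambda_max of the empirical Hessian X^T X / n, so GD is stable.
lr = n / np.linalg.norm(X, 2) ** 2
steps = 200

full = gd_path(X, y, lr, steps)               # trained on noisy labels
signal = gd_path(X, X @ w_star, lr, steps)    # trained on clean labels only
noise_only = gd_path(X, noise, lr, steps)     # trained on noise only

# Linearity in the labels: the noisy trajectory is the sum of the two parts.
gap = np.max(np.abs(full - (signal + noise_only)))

# With covariance I / d, the excess risk of w is ||w - w_star||^2 / d.
bias_term = np.sum((signal - w_star) ** 2, axis=1) / d
variance_term = np.sum(noise_only ** 2, axis=1) / d

for t in (0, 10, 50, 200):
    print(t, bias_term[t], variance_term[t])
```

In this toy setting the bias term shrinks as training proceeds while the variance term starts at zero and grows as the model fits the noise. The full excess risk also contains a cross term between the two parts, which vanishes in expectation over zero-mean noise that is independent of the covariates.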
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Neural Networks and Applications
Methods: Linear Regression
