Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel
Yilan Chen, Zhichao Wang, Wei Huang, Andi Han, Taiji Suzuki, Arya Mazumdar

TL;DR
This paper derives a data-dependent generalization bound for gradient flow in neural networks, using a novel loss path kernel that captures training dynamics and improves understanding of neural network generalization.
Contribution
It introduces the loss path kernel (LPK) to analyze gradient flow, yielding data-adaptive generalization bounds that are tighter than those obtained from static kernels such as NTK and that connect neural networks to kernel methods.
Findings
The bound aligns with classical Rademacher complexity bounds for kernel methods.
The norm of the training-loss gradients along the optimization trajectory influences the final generalization performance.
Numerical experiments show bounds correlate with actual generalization gaps.
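For reference, the classical kernel-method bound that the first finding alludes to can be written as follows. This is the standard textbook statement, not a formula taken from this summary; the symbols B (RKHS-norm radius), n (sample size), K (kernel), and \mathbf{K} (its Gram matrix on the sample) are introduced here only for illustration.

```latex
% Empirical Rademacher complexity of an RKHS ball
% \mathcal{F}_B = \{ f \in \mathcal{H} : \|f\|_{\mathcal{H}} \le B \}
% on a sample S = \{x_1, \dots, x_n\}:
\[
  \hat{\mathfrak{R}}_S(\mathcal{F}_B)
  \;\le\; \frac{B}{n} \sqrt{\sum_{i=1}^{n} K(x_i, x_i)}
  \;=\; \frac{B \sqrt{\operatorname{Tr}(\mathbf{K})}}{n}.
\]
```

The two quantities named in the abstract, the RKHS norm and the kernel trace, appear here as B and Tr(\mathbf{K}) respectively.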
Abstract
Gradient-based optimization methods have shown remarkable empirical success, yet their theoretical generalization properties remain only partially understood. In this paper, we establish a generalization bound for gradient flow that aligns with the classical Rademacher complexity bounds for kernel methods, specifically those based on the RKHS norm and kernel trace, through a data-dependent kernel called the loss path kernel (LPK). Unlike static kernels such as NTK, the LPK captures the entire training trajectory, adapting to both data and optimization dynamics, leading to tighter and more informative generalization guarantees. Moreover, the bound highlights how the norm of the training loss gradients along the optimization trajectory influences the final generalization performance. The key technical ingredients in our proof combine stability analysis of gradient flow with uniform…
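To make the idea of a trajectory-dependent kernel concrete, here is a minimal numerical sketch. It assumes (this summary does not state the definition) that the LPK between two examples is the time integral of the inner product of their loss gradients along the training path, and it approximates gradient flow with small-step gradient descent on a linear model with squared loss. The function name loss_path_kernel, the model, and all parameters are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def loss_path_kernel(X, y, steps=200, dt=0.01, seed=0):
    """Riemann-sum approximation of an assumed loss path kernel.

    Assumed form: K[i, j] ~ integral over t of <grad l(w(t), z_i), grad l(w(t), z_j)>,
    with l the squared loss 0.5 * (w.x - y)^2 of a linear model (illustrative choice).
    Gradient flow is approximated by Euler steps of size dt.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = 0.1 * rng.normal(size=d)          # small random initialization
    K = np.zeros((n, n))
    grad_norm_integral = 0.0              # integral of ||grad of training loss||^2
    for _ in range(steps):
        # Per-example loss gradients at the current parameters, shape (n, d).
        G = (X @ w - y)[:, None] * X
        # Accumulate the Gram matrix of per-example gradients over time.
        K += dt * (G @ G.T)
        # Gradient of the average training loss.
        g = G.mean(axis=0)
        grad_norm_integral += dt * float(g @ g)
        # Euler step of gradient flow dw/dt = -g.
        w = w - dt * g
    return K, grad_norm_integral, w

# Small synthetic example (hypothetical data).
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))
y = X @ np.array([1.0, -2.0, 0.5])
K, grad_norm_integral, w = loss_path_kernel(X, y)
print("kernel trace:", np.trace(K))
print("integrated squared gradient norm:", grad_norm_integral)
```

In this sketch the average of all kernel entries equals the integrated squared norm of the training-loss gradient, which is one simple way to see how gradient norms along the trajectory enter a kernel of this form. The kernel changes with the data and with the path taken by the optimizer, in contrast to a kernel fixed at initialization.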
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications
