From Optimization Dynamics to Generalization Bounds via Łojasiewicz Gradient Inequality
Fusheng Liu, Haizhao Yang, Soufiane Hayou, Qianxiao Li

TL;DR
This paper introduces a framework linking optimization trajectories to generalization bounds in machine learning, utilizing the Uniform-LGI property to derive convergence rates and generalization estimates for various models.
Contribution
It proposes the Uniform-LGI as a key property to connect optimization dynamics with generalization, providing new bounds applicable to multiple models.
Findings
Derived convergence rates for gradient flow algorithms.
Established generalization bounds for linear regression, kernel regression, and neural networks.
Matched or extended previous generalization results.
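As a hedged illustration of how a Łojasiewicz-type gradient inequality yields convergence rates under gradient flow, the sketch below works through the textbook argument. The symbols $L$, $L^*$, $c$, and $\theta$ are assumptions introduced here for illustration; the paper's Uniform-LGI is a uniform version of this kind of condition, and its exact constants and statement may differ.

```latex
% Assumed setting: gradient flow on a loss L with infimum L^*,
%   \dot{w}(t) = -\nabla L(w(t)),
% and a Lojasiewicz gradient inequality along the trajectory,
%   \|\nabla L(w)\| \ge c \, (L(w) - L^*)^{\theta}, \quad c > 0, \ \theta \in [1/2, 1).
\begin{align*}
\frac{d}{dt}\bigl(L(w(t)) - L^*\bigr)
  &= -\|\nabla L(w(t))\|^2
   \le -c^2 \bigl(L(w(t)) - L^*\bigr)^{2\theta}. \\
\intertext{Case $\theta = 1/2$ (linear rate, by Gr\"onwall's inequality):}
L(w(t)) - L^* &\le \bigl(L(w(0)) - L^*\bigr)\, e^{-c^2 t}. \\
\intertext{Case $\theta \in (1/2, 1)$ (polynomial rate, integrating $u^{-2\theta}\dot{u} \le -c^2$ with $u = L - L^*$):}
L(w(t)) - L^* &\le \Bigl[\bigl(L(w(0)) - L^*\bigr)^{1 - 2\theta} + (2\theta - 1)\, c^2 t\Bigr]^{-\frac{1}{2\theta - 1}}.
\end{align*}
```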
Abstract
Optimization and generalization are two essential aspects of statistical machine learning. In this paper, we propose a framework to connect optimization with generalization by analyzing the generalization error based on the optimization trajectory under the gradient flow algorithm. The key ingredient of this framework is the Uniform-LGI, a property that is generally satisfied when training machine learning models. Leveraging the Uniform-LGI, we first derive convergence rates for the gradient flow algorithm, and then give generalization bounds for a large class of machine learning models. We further apply our framework to three distinct machine learning models: linear regression, kernel regression, and two-layer neural networks. Through our approach, we obtain generalization estimates that match or extend previous results.
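To make the abstract's ingredients concrete, here is a minimal numerical sketch for the linear regression case, not the authors' implementation. It discretizes gradient flow with explicit Euler steps on a synthetic least-squares problem and checks a Łojasiewicz gradient inequality of the form ‖∇L(w)‖ ≥ c (L(w) − L*)^θ along the trajectory. The data, the function names, the step size, and the choice θ = 1/2 with c = sqrt(2 · λ_min(XᵀX/n)) are all assumptions made for this example; the paper's Uniform-LGI definition and constants may differ.

```python
import numpy as np


def make_problem(n=50, d=5, seed=0):
    """Synthetic least-squares data (hypothetical, for illustration only)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
    return X, y


def loss(w, X, y):
    """Empirical squared loss L(w) = (1 / 2n) * ||Xw - y||^2."""
    return 0.5 * np.mean((X @ w - y) ** 2)


def grad(w, X, y):
    """Gradient of the empirical squared loss."""
    return X.T @ (X @ w - y) / len(y)


def run_gradient_flow(X, y, step=1e-2, n_steps=2000):
    """Explicit Euler discretization of dw/dt = -grad L(w), started at w = 0."""
    w = np.zeros(X.shape[1])
    traj = [w.copy()]
    for _ in range(n_steps):
        w = w - step * grad(w, X, y)
        traj.append(w.copy())
    return np.array(traj)


def lgi_slack(traj, X, y):
    """Check ||grad L(w)|| >= c * (L(w) - L*)^(1/2) along a trajectory.

    For a full-rank least-squares problem this holds with
    c = sqrt(2 * mu), where mu is the smallest eigenvalue of X^T X / n.
    Returns the smallest slack (non-negative up to rounding if the
    inequality holds), the loss gaps, and the constant c.
    """
    w_star = np.linalg.lstsq(X, y, rcond=None)[0]
    l_star = loss(w_star, X, y)
    mu = np.linalg.eigvalsh(X.T @ X / len(y)).min()
    c = np.sqrt(2.0 * mu)
    # Clip tiny negative gaps caused by floating-point rounding.
    gaps = np.maximum([loss(w, X, y) - l_star for w in traj], 0.0)
    gnorms = np.array([np.linalg.norm(grad(w, X, y)) for w in traj])
    slack = gnorms - c * np.sqrt(gaps)
    return slack.min(), gaps, c


if __name__ == "__main__":
    X, y = make_problem()
    traj = run_gradient_flow(X, y)
    min_slack, gaps, c = lgi_slack(traj, X, y)
    print("LGI constant c:", c)
    print("smallest slack along trajectory:", min_slack)
    print("initial and final loss gap:", gaps[0], gaps[-1])
```

In this quadratic example the inequality holds globally, so the trajectory check is only a sanity test. For models such as two-layer neural networks, the condition would need to be verified along the actual optimization path, which is where a trajectory-based property becomes relevant.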
Taxonomy
Topics: Face and Expression Recognition · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques
