Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
Arindam Banerjee, Qiaobo Li, Yingxue Zhou

TL;DR
This paper introduces a framework that uses the Loss Gradient Gaussian Width (LGGW), a measure of the complexity of the loss gradients, to provide generalization and optimization guarantees for deep models, rather than relying on the Rademacher complexity of the predictors.
Contribution
It presents the first generalization and optimization guarantees stated in terms of the LGGW. The guarantees hold under a flexible gradient domination condition and are instantiated for deep networks.
Findings
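LGGW-based generalization guarantees hold under a flexible gradient domination condition, which includes the PL (Polyak-Łojasiewicz) condition as a special case.
Reusing samples across iterations of gradient descent does not make the empirical gradients deviate from the population gradients, as long as the LGGW is small.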
Bounds on LGGW for deep networks relate to the Gaussian width of features.
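For readers unfamiliar with the two notions named in the findings, the standard textbook forms are sketched below. This summary does not reproduce the paper's exact definitions, so the symbols here are assumptions for illustration: A is any bounded set in R^p (for the LGGW, a set of loss gradients), g is a standard Gaussian vector, L is a loss with infimum L^*, and \mu > 0 is the PL constant. The paper's gradient domination condition is more general than the PL inequality shown.

```latex
% Gaussian width of a bounded set A \subseteq \mathbb{R}^p
w(A) \;=\; \mathbb{E}_{g \sim \mathcal{N}(0, I_p)}
      \Big[ \sup_{a \in A} \langle a, g \rangle \Big]

% Polyak-Lojasiewicz (PL) condition with constant \mu > 0
\tfrac{1}{2} \, \big\| \nabla L(\theta) \big\|_2^2
      \;\ge\; \mu \, \big( L(\theta) - L^{*} \big)
      \qquad \text{for all } \theta
```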
Abstract
Generalization and optimization guarantees on the population loss often rely on uniform-convergence-based analysis, typically based on the Rademacher complexity of the predictors. The rich representation power of modern models has led to concerns about this approach. In this paper, we present generalization and optimization guarantees in terms of the complexity of the gradients, as measured by the Loss Gradient Gaussian Width (LGGW). First, we introduce generalization guarantees directly in terms of the LGGW under a flexible gradient domination condition, which includes the popular PL (Polyak-Łojasiewicz) condition as a special case. Second, we show that sample reuse in iterative gradient descent does not make the empirical gradients deviate from the population gradients as long as the LGGW is small. Third, focusing on deep networks, we bound their single-sample LGGW in terms of the…
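To make the idea of measuring gradient complexity by a Gaussian width concrete, here is a minimal sketch that estimates the Gaussian width of a finite set of vectors by Monte Carlo. It is not the paper's method or code: the function name `gaussian_width_mc`, its parameters, and the toy set are hypothetical, and a finite set of vectors only stands in for the (generally infinite) set of loss gradients the LGGW is defined over.

```python
import numpy as np


def gaussian_width_mc(vectors, num_draws=2000, seed=0):
    """Monte Carlo estimate of w(A) = E_g[max_{a in A} <a, g>] for a finite set A.

    `vectors` is an (m, d) array-like whose rows are the elements of A,
    e.g. a handful of loss-gradient vectors (hypothetical stand-in for the
    gradient set used in the LGGW).
    """
    A = np.asarray(vectors, dtype=float)              # shape (m, d)
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((num_draws, A.shape[1]))  # one Gaussian vector per row
    sups = (G @ A.T).max(axis=1)                      # max over the set, per draw
    return float(sups.mean())                         # average over draws


# Toy check: for A = {e1, -e1} the width is E|g_1| = sqrt(2 / pi) ~ 0.798,
# so the estimate should land close to that value.
toy_set = np.array([[1.0, 0.0], [-1.0, 0.0]])
print(gaussian_width_mc(toy_set))
```

A small estimate means no direction in the set correlates strongly with random noise, which is the intuition behind "small LGGW" in the abstract. The estimator is exact only in the limit of many draws.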
Taxonomy
Topics: Advanced Data Compression Techniques · Image Enhancement Techniques · Advanced Image and Video Retrieval Techniques
