Towards Data-Algorithm Dependent Generalization: a Case Study on Overparameterized Linear Regression
Jing Xu, Jiaye Teng, Yang Yuan, Andrew Chi-Chih Yao

TL;DR
This paper studies generalization in overparameterized linear regression, argues that it depends on the interplay between the data distribution and the training algorithm, and introduces a data-algorithm compatibility notion to capture that interplay.
Contribution
It introduces the data-algorithm compatibility notion and provides a trajectory-based analysis for overparameterized linear regression, highlighting the role of early stopping.
Findings
Taking early-stopping iterates into account lets generalization hold under weaker conditions than last-iterate analysis requires.
Traditional last-iterate analysis can be overly restrictive.
Data-algorithm compatibility assesses generalization over the entire data-dependent training trajectory rather than only the last iterate.
Abstract
One of the major open problems in machine learning is to characterize generalization in the overparameterized regime, where most traditional generalization bounds become inconsistent even for overparameterized linear regression. In many scenarios, this failure can be attributed to obscuring the crucial interplay between the training algorithm and the underlying data distribution. This paper demonstrates that the generalization behavior of overparameterized models should be analyzed in a manner that is both data-relevant and algorithm-relevant. To make a formal characterization, we introduce a notion called data-algorithm compatibility, which considers the generalization behavior of the entire data-dependent training trajectory, instead of the traditional last-iterate analysis. We validate our claim by studying the setting of solving overparameterized linear regression with gradient descent.…
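The contrast between trajectory-based and last-iterate analysis can be illustrated with a small simulation. The sketch below is not the authors' experiment: the function name gd_trajectory_risks, the Gaussian features with a polynomially decaying covariance spectrum, the noise level, and the step size are all assumptions chosen for illustration. It runs gradient descent from zero on an overparameterized least-squares problem (more features than samples) and records the population excess risk of every iterate, so the best early-stopped iterate can be compared with the last one.

```python
import numpy as np

def gd_trajectory_risks(n=50, d=500, steps=2000, noise_std=0.5, seed=0):
    """Excess risk of every gradient descent iterate (hypothetical setup)."""
    rng = np.random.default_rng(seed)

    # Assumed data model: independent Gaussian features with a diagonal
    # covariance whose eigenvalues decay as 1 / k^2.
    eigs = 1.0 / np.arange(1, d + 1) ** 2
    X = rng.standard_normal((n, d)) * np.sqrt(eigs)
    w_star = rng.standard_normal(d)
    y = X @ w_star + noise_std * rng.standard_normal(n)

    # Step size 1 / L, where L is the largest eigenvalue of X^T X / n,
    # i.e. the smoothness constant of the loss (1 / 2n) * ||Xw - y||^2.
    lr = n / np.linalg.norm(X, 2) ** 2

    w = np.zeros(d)
    risks = np.empty(steps + 1)
    # Excess risk of w is (w - w*)^T Sigma (w - w*) with Sigma = diag(eigs).
    risks[0] = np.sum(eigs * (w - w_star) ** 2)
    for t in range(1, steps + 1):
        grad = X.T @ (X @ w - y) / n
        w = w - lr * grad
        risks[t] = np.sum(eigs * (w - w_star) ** 2)
    return risks

if __name__ == "__main__":
    risks = gd_trajectory_risks()
    best_t = int(np.argmin(risks))
    print("best iterate:", best_t, "excess risk:", risks[best_t])
    print("last iterate:", len(risks) - 1, "excess risk:", risks[-1])
```

The minimum over the trajectory is never worse than the last iterate by construction. Whether it is strictly better, and by how much, depends on the assumed spectrum and noise level, which is the kind of data-algorithm dependence the abstract describes.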
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
