The Impact of Anisotropic Covariance Structure on the Training Dynamics and Generalization Error of Linear Networks
Taishi Watanabe, Ryo Karakida, Jun-nosuke Teramae

TL;DR
This paper investigates how anisotropic data covariance structures influence the learning process and generalization in linear neural networks, revealing phase-based dynamics and the importance of data-task alignment.
Contribution
The paper provides a theoretical analysis of learning dynamics and generalization error under anisotropic data, a setting that is less well understood than the isotropic case.
Findings
Learning occurs in two phases driven by data structure.
Alignment of data spikes with the task improves generalization.
An analytical expression for the generalization error is derived.
Abstract
The success of deep neural networks largely depends on the statistical structure of the training data. While learning dynamics and generalization on isotropic data are well-established, the impact of pronounced anisotropy on these crucial aspects is not yet fully understood. We examine the impact of data anisotropy, represented by a spiked covariance structure, a canonical yet tractable model, on the learning dynamics and generalization error of a two-layer linear network in a linear regression setting. Our analysis reveals that the learning dynamics proceed in two distinct phases, governed initially by the input-output correlation and subsequently by other principal directions of the data structure. Furthermore, we derive an analytical expression for the generalization error, quantifying how the alignment of the spike structure of the data with the learning task improves performance.…
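The setting described in the abstract can be explored numerically with a small sketch like the one below. It is not the authors' code or their exact model. It assumes a covariance of the form identity plus a rank-one spike, a noise-free linear teacher, full-batch gradient descent with small random initialization, and the standard population error (w - w*)^T Σ (w - w*) for a linear predictor. All function names and parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def spiked_covariance(dim, spike_dir, spike_strength):
    # Assumed form: identity plus a rank-one spike along a unit direction.
    u = spike_dir / np.linalg.norm(spike_dir)
    return np.eye(dim) + spike_strength * np.outer(u, u)

def sample_inputs(n, cov, rng):
    # Draw n zero-mean Gaussian inputs with the given covariance.
    chol = np.linalg.cholesky(cov)
    return rng.standard_normal((n, cov.shape[0])) @ chol.T

def train_two_layer_linear(X, y, hidden=16, lr=0.02, steps=3000,
                           init_scale=1e-2, seed=0):
    # Two-layer linear network f(x) = w2^T W1 x, trained by full-batch
    # gradient descent on the loss (1 / 2n) * sum (f(x_i) - y_i)^2.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = init_scale * rng.standard_normal((hidden, d))
    w2 = init_scale * rng.standard_normal(hidden)
    for _ in range(steps):
        h = X @ W1.T                      # hidden activations, shape (n, hidden)
        resid = h @ w2 - y                # prediction errors, shape (n,)
        grad_w2 = h.T @ resid / n
        grad_W1 = np.outer(w2, X.T @ resid / n)
        w2 = w2 - lr * grad_w2
        W1 = W1 - lr * grad_W1
    return W1.T @ w2                      # effective input-output weight, shape (d,)

def generalization_error(w, w_star, cov):
    # Population squared error of a linear predictor for a noise-free teacher.
    delta = w - w_star
    return float(delta @ cov @ delta)

def run_demo(dim=10, n=500, spike_strength=5.0, seed=0):
    # Teacher aligned with the spike direction (one illustrative choice).
    rng = np.random.default_rng(seed)
    w_star = np.zeros(dim)
    w_star[0] = 1.0
    cov = spiked_covariance(dim, w_star, spike_strength)
    X = sample_inputs(n, cov, rng)
    y = X @ w_star
    w = train_two_layer_linear(X, y, seed=seed)
    err_init = generalization_error(np.zeros(dim), w_star, cov)
    err_final = generalization_error(w, w_star, cov)
    return err_init, err_final
```

Calling `run_demo()` returns the error of the zero predictor and the error after training. Changing the spike direction passed to `spiked_covariance` so that it is no longer aligned with `w_star` gives a simple way to probe the alignment effect the abstract describes, under the assumptions stated above.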
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Reservoir Computing · Neural Networks and Applications
