Over-Alignment vs Over-Fitting: The Role of Feature Learning Strength in Generalization
Taesun Yeom, Taehyeok Ha, Jaeho Lee

TL;DR
This paper investigates how feature learning strength (FLS) influences neural network generalization, revealing an optimal FLS that balances over-alignment and over-fitting, supported by empirical and theoretical analyses.
Contribution
It introduces the concept of an optimal FLS for neural networks trained with early stopping, challenging the idea that stronger feature learning always improves generalization.
Findings
Existence of an optimal FLS balancing over-alignment and over-fitting.
Empirical evidence of a non-monotonic relationship between FLS and generalization.
Theoretical analysis explaining the trade-off controlling FLS effects.
Abstract
Feature learning strength (FLS), i.e., the inverse of the effective output scaling of a model, plays a critical role in shaping the optimization dynamics of neural nets. While its impact has been extensively studied under the asymptotic regimes -- both in training time and FLS -- existing theory offers limited insight into how FLS affects generalization in practical settings, such as when training is stopped upon reaching a target training risk. In this work, we investigate the impact of FLS on generalization in deep networks under such practical conditions. Through empirical studies, we first uncover the emergence of an optimal FLS -- neither too small nor too large -- that yields substantial generalization gains. This finding runs counter to the prevailing intuition that stronger feature learning universally improves generalization. To explain this phenomenon, we develop a…
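To make the two ingredients of the abstract concrete (FLS as the inverse of an output scale, and stopping once a target training risk is reached), here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' setup: the two-layer tanh network, the centering of the output at initialization, the 1/alpha^2 learning-rate scaling, and every name (`net`, `scaled_output`, `train_until_target`, `alpha`, `target_risk`) are hypothetical choices made for this example.

```python
import numpy as np


def net(params, X):
    """Unscaled two-layer tanh network g(x); also returns hidden activations."""
    W1, w2 = params
    H = np.tanh(X @ W1)
    return H @ w2, H


def scaled_output(params, params0, X, alpha):
    """Model output f(x) = alpha * (g(x; params) - g(x; params0)).

    alpha is the output scale, so FLS = 1 / alpha in the abstract's wording.
    Subtracting the output at initialization is an assumption of this sketch;
    it makes f exactly zero before training.
    """
    return alpha * (net(params, X)[0] - net(params0, X)[0])


def train_until_target(X, y, alpha, target_risk,
                       lr=0.02, width=32, max_steps=5000, seed=0):
    """Full-batch gradient descent, stopped once training risk <= target_risk.

    Returns (params, params0, steps_taken, final_training_risk).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W1 = rng.normal(size=(d, width)) / np.sqrt(d)
    w2 = rng.normal(size=width) / np.sqrt(width)
    params0 = (W1.copy(), w2.copy())
    g0 = net(params0, X)[0]
    n = len(y)
    risk = np.inf
    for step in range(max_steps + 1):
        g, H = net((W1, w2), X)
        resid = alpha * (g - g0) - y
        risk = 0.5 * np.mean(resid ** 2)      # training risk (half MSE)
        if risk <= target_risk or step == max_steps:
            break                              # stop at target or step budget
        dg = alpha * resid / n                 # d(risk) / d(g)
        grad_w2 = H.T @ dg
        grad_W1 = X.T @ (np.outer(dg, w2) * (1.0 - H ** 2))
        # Assumed convention: step size scaled by 1 / alpha^2 so that the
        # function-space speed is comparable across different alpha.
        W1 = W1 - (lr / alpha ** 2) * grad_W1
        w2 = w2 - (lr / alpha ** 2) * grad_w2
    return (W1, w2), params0, step, risk
```

Sweeping `alpha` over a grid with a fixed `target_risk`, then evaluating `scaled_output` on held-out inputs, gives the kind of FLS-versus-test-risk curve the abstract describes. The sketch makes no claim about what that curve looks like for this toy model, and very small `alpha` may need a smaller `lr` to remain stable.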
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Neural Networks and Reservoir Computing
