Gradient-Direction Sensitivity Reveals Linear-Centroid Coupling Hidden by Optimizer Trajectories
Yongzhong Xu

TL;DR
This paper demonstrates that running the rolling SVD diagnostic on loss gradients instead of AdamW updates reveals a much stronger coupling between SED directions and Linear Centroid Hypothesis (LCH) features, clarifying feature formation and the effects of gradient rank constraints.
Contribution
It introduces a gradient-based SVD diagnostic that uncovers hidden feature coupling and shows how gradient rank constraints influence model training and feature formation.
Findings
Gradient-based SVD reveals stronger feature coupling than update-based SVD.
Gradient rank constraints accelerate grokking by approximately 2.3 times.
Full-rank AdamW attention updates are highly rank-redundant under the studied hyperparameters.
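As a rough illustration of what a gradient rank constraint can look like, the sketch below truncates a 2-D gradient matrix to its top-k singular components before it would be handed to an optimizer. This is an assumption about the general idea, not the paper's procedure: the function name `truncate_rank`, the matrix shape, and the choice k = 2 are all hypothetical.

```python
import numpy as np

def truncate_rank(grad, k):
    """Best rank-k approximation of a 2-D matrix in Frobenius norm (Eckart-Young)."""
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    k = min(k, s.size)
    # Scale the first k left singular vectors by their singular values, then recombine.
    return (u[:, :k] * s[:k]) @ vt[:k]

# Hypothetical stand-in for an attention-weight gradient.
rng = np.random.default_rng(0)
grad = rng.standard_normal((8, 6))
low = truncate_rank(grad, 2)  # rank-2 gradient that a constrained update would use
```

The discarded part of the gradient has Frobenius norm equal to the root sum of squares of the dropped singular values, which is one simple way to gauge how much signal a given rank budget removes.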
Abstract
We show that replacing the rolling SVD of AdamW updates with a rolling SVD of loss gradients changes the diagnostic by 1-2 orders of magnitude. Performing SVD on the loss gradient instead of the AdamW update increases the measured perturbative coupling between SED directions and Linear Centroid Hypothesis (LCH) features from -- to -- across four single-task modular arithmetic operations, eliminating the apparent operation dependence in the original measurement. On a multitask transformer with a shared encoder, update-based SED gives -- an apparent failure of the diagnostic -- while per-operation gradient-based SED recovers -- across all four operations. Gradient aggregation across competing tasks is the main obstruction; performing SVD on per-task gradients resolves it. A causal intervention…
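The difference between an update-based and a gradient-based rolling SVD can be sketched on synthetic data. The code below is a minimal illustration under stated assumptions, not the paper's implementation: it assumes "rolling SVD" means stacking the last few flattened matrices and taking the top right singular vectors, it uses `np.sign` as a crude stand-in for the per-coordinate rescaling an Adam-style optimizer applies, and the names `rolling_top_directions` and `subspace_overlap` are hypothetical.

```python
import numpy as np

def rolling_top_directions(history, window, k):
    """Top-k right singular vectors of the last `window` flattened matrices, shape (k, D)."""
    stacked = np.stack([m.ravel() for m in history[-window:]])
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    return vt[:k]

def subspace_overlap(a, b):
    """Mean squared cosine between two orthonormal row bases; 1.0 means identical subspaces."""
    return float(np.linalg.norm(a @ b.T) ** 2 / min(len(a), len(b)))

# Synthetic setup (assumption): every gradient lies in a fixed 2-D subspace.
rng = np.random.default_rng(0)
shape, dim = (4, 3), 12
basis = np.linalg.qr(rng.standard_normal((dim, 2)))[0].T  # (2, 12), orthonormal rows
grads = [(rng.standard_normal(2) @ basis).reshape(shape) for _ in range(10)]

# Crude stand-in for optimizer updates: elementwise sign of each gradient.
updates = [np.sign(g) for g in grads]

grad_dirs = rolling_top_directions(grads, window=5, k=2)
upd_dirs = rolling_top_directions(updates, window=5, k=2)

grad_overlap = subspace_overlap(grad_dirs, basis)  # gradient-based directions vs. true subspace
upd_overlap = subspace_overlap(upd_dirs, basis)    # update-based directions vs. true subspace
```

In this toy setting the gradient-based directions recover the planted subspace, while the nonlinearly rescaled updates generally do not. For a multitask model, the same function could be called on a separate gradient history per task rather than on the summed gradient, in the spirit of the per-operation analysis described in the abstract.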
