Why Deep Jacobian Spectra Separate: Depth-Induced Scaling and Singular-Vector Alignment
Nathanaël Haas, François Gatine, Augustin M Cosse, Zied Bouraoui

TL;DR
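This paper studies the singular-value spectra of deep-network Jacobians. It shows that depth induces exponential scaling of the ordered singular values and that sufficiently strong spectral separation forces singular-vector alignment, two signatures the authors connect to implicit bias in deep network training.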
Contribution
It introduces a theoretical framework linking depth-induced spectral phenomena to implicit bias, supported by empirical validation in fixed-gates models.
Findings
Depth causes exponential scaling of singular values.
Strong spectral separation leads to singular-vector alignment.
Results support a mechanistic understanding of low-rank Jacobian structures.
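A minimal numerical sketch of the first finding, under assumptions that are not taken from the paper: each layer is a Gaussian matrix with variance 1/(number of active units), followed by a random 0/1 row mask with exactly half the units active (a stand-in for a fixed ReLU gate pattern). The function name `top_lyapunov_exponents` and all parameter choices are hypothetical. It estimates the top-k Lyapunov exponents of the masked product with the standard QR re-orthonormalization method, so that the i-th singular value of a depth-L product scales roughly like exp(L * lambda_i).

```python
import numpy as np

def top_lyapunov_exponents(depth, width, k=3, seed=0):
    """Estimate the top-k Lyapunov exponents of a product of masked Gaussian
    layers D_l @ W_l using QR re-orthonormalization.

    Assumptions (illustrative, not the paper's exact model):
      - W_l has i.i.d. N(0, 1/m) entries, with m = width // 2 active units
      - D_l is a diagonal 0/1 mask with exactly m ones, redrawn per layer
    """
    rng = np.random.default_rng(seed)
    m = width // 2
    # Random orthonormal frame of k directions to push through the layers.
    Q, _ = np.linalg.qr(rng.standard_normal((width, k)))
    log_growth = np.zeros(k)
    for _ in range(depth):
        W = rng.standard_normal((width, width)) / np.sqrt(m)
        mask = np.zeros(width)
        mask[rng.permutation(width)[:m]] = 1.0   # fixed gate pattern for this layer
        M = mask[:, None] * W                    # masked linear map D_l @ W_l
        # Re-orthonormalize; |diag(R)| records the per-layer stretch factors.
        Q, R = np.linalg.qr(M @ Q)
        log_growth += np.log(np.abs(np.diag(R)))
    return log_growth / depth

exps = top_lyapunov_exponents(depth=4000, width=16, k=3)
print("estimated exponents:", exps)
# Distinct exponents mean consecutive singular values separate
# exponentially in depth, with rate given by the gap.
print("gaps between consecutive exponents:", -np.diff(exps))
```

A positive gap between consecutive exponents implies that the ratio of the corresponding singular values grows exponentially with depth, which is the kind of spectral separation the findings describe. Accumulating logs layer by layer, rather than taking an SVD of the full product, avoids numerical underflow at large depth.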
Abstract
Understanding why gradient-based training in deep networks exhibits strong implicit bias remains challenging, in part because tractable singular-value dynamics are typically available only for balanced deep linear models. We propose an alternative route based on two theoretically grounded and empirically testable signatures of deep Jacobians: depth-induced exponential scaling of ordered singular values and strong spectral separation. Adopting a fixed-gates view of piecewise-linear networks, where Jacobians reduce to products of masked linear maps within a single activation region, we prove the existence of Lyapunov exponents governing the top singular values at initialization, give closed-form expressions in a tractable masked model, and quantify finite-depth corrections. We further show that sufficiently strong separation forces singular-vector alignment in matrix products, yielding an…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Reservoir Computing · Model Reduction and Neural Networks
