Criticality and Saturation in Orthogonal Neural Networks

Max Guillen; Jan E. Gerken

arXiv:2605.06563·cs.LG·May 15, 2026

TL;DR

This paper provides a theoretical framework explaining why the tensors in the finite-width expansion of orthogonally initialized neural networks stabilize at large depth, which helps account for the training benefits of orthogonal initialization.

Contribution

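It derives explicit layer-wise recursion relations for the tensors in the finite-width expansion of orthogonally initialized networks, and extends recently introduced Feynman diagrams for the corresponding i.i.d. recursions so that they are valid to all orders in $1/\text{width}$.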

Findings

01

Recursion relations reproduce observed tensor stability in orthogonal networks.

02

Theoretical results match Monte Carlo simulations and large-depth expansions.

03

Orthogonal initialization improves stability and training performance.

Abstract

It has long been known that initializing weight matrices to be orthogonal, instead of having i.i.d. Gaussian components, can improve training performance. This phenomenon can be analyzed using finite-width corrections, where the infinite-width statistics are supplemented by a power series in $1/\text{width}$. In particular, recent empirical results by Day et al. show that the tensors appearing in this treatment stabilize at large depth, in contrast to the tensors of i.i.d.-initialized networks. In this article, we derive explicit layer-wise recursion relations for the tensors appearing in the finite-width expansion of the network statistics in the case of orthogonal initializations. We also provide an extension of recently introduced Feynman diagrams for the corresponding recursions in the i.i.d. case, which are valid to all orders in $1/\text{width}$. Finally, we show…
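To make the contrast between the two initializations in the abstract concrete, here is a minimal NumPy sketch. It is not the paper's method: it uses a purely linear network with no nonlinearity, and the function names (`haar_orthogonal`, `gaussian_iid`, `norm_ratio_through_depth`) as well as the width and depth values are illustrative assumptions. It only shows that an orthogonal weight matrix preserves the norm of a signal exactly, while an i.i.d. Gaussian matrix with variance $1/\text{width}$ preserves it only on average, so fluctuations accumulate with depth.

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix (Haar-distributed) via QR."""
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    # QR is only unique up to column signs; fixing them with the signs of
    # diag(r) makes the resulting distribution Haar.
    d = np.sign(np.diag(r))
    d[d == 0] = 1.0
    return q * d  # scales column j of q by d[j]

def gaussian_iid(n, rng):
    """Sample an n x n matrix with i.i.d. N(0, 1/n) entries."""
    return rng.standard_normal((n, n)) / np.sqrt(n)

def norm_ratio_through_depth(sample_weight, width, depth, rng):
    """Propagate a random vector through `depth` linear layers.

    Assumption for illustration: no nonlinearity and no bias, with an
    independent weight matrix drawn for every layer.
    Returns ||x_depth|| / ||x_0||.
    """
    x = rng.standard_normal(width)
    x0_norm = np.linalg.norm(x)
    for _ in range(depth):
        x = sample_weight(width, rng) @ x
    return np.linalg.norm(x) / x0_norm

rng = np.random.default_rng(0)
width, depth = 64, 50  # hypothetical sizes chosen for a quick run

orth = norm_ratio_through_depth(haar_orthogonal, width, depth, rng)
gauss = norm_ratio_through_depth(gaussian_iid, width, depth, rng)

print(f"orthogonal: norm ratio after {depth} layers = {orth:.6f}")
print(f"i.i.d. Gaussian: norm ratio after {depth} layers = {gauss:.6f}")
```

In this linear setting the orthogonal ratio equals 1 up to floating-point error, whereas the Gaussian ratio varies from seed to seed. The finite-width tensors studied in the paper concern networks with nonlinearities, which this sketch does not attempt to reproduce.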
