Beyond the Birkhoff Polytope: Spectral-Sphere-Constrained Hyper-Connections
Zhaoyi Liu, Haichuan Zhang, Ang Li

TL;DR
This paper introduces Spectral-Sphere-Constrained Hyper-Connections (sHC), which replace the Birkhoff polytope constraint on hyper-connection residual matrices with a spectral sphere constraint, aiming for more expressive and more stable residual matrices in neural networks.
Contribution
The paper proposes a spectral sphere constraint for hyper-connections, addressing limitations of Birkhoff polytope constraints by allowing negative entries and improving expressivity and stability.
Findings
sHC enables subtractive feature interactions.
It avoids unstable Sinkhorn iterations.
It improves residual matrix expressivity.
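To make the idea of a subtractive feature interaction concrete, here is a minimal sketch in plain Python. It assumes, as an interpretation not confirmed by this summary, that the "spectral sphere" refers to matrices with unit spectral norm (largest singular value equal to 1). The matrix `mix`, the stream values, and the helper names `matvec` and `spectral_norm` are illustrative choices, not the authors' parameterization.

```python
import math

def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]

def spectral_norm(a, n_iters=200):
    """Estimate the largest singular value by power iteration on A^T A."""
    rows, cols = len(a), len(a[0])
    a_t = [[a[i][j] for i in range(rows)] for j in range(cols)]
    v = [1.0] * cols
    for _ in range(n_iters):
        w = matvec(a_t, matvec(a, v))
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = matvec(a, v)
    return math.sqrt(sum(x * x for x in av))

# Hypothetical 2-stream mixing matrix with a negative entry (a scaled rotation).
# ASSUMPTION: "spectral sphere" is read here as "spectral norm equal to 1".
s = 1.0 / math.sqrt(2.0)
mix = [[s, -s],
       [s,  s]]

streams = [3.0, 1.0]            # toy scalar features, one per stream
mixed = matvec(mix, streams)    # first output is a scaled difference of the streams

norm = spectral_norm(mix)
has_negative_entry = min(min(row) for row in mix) < 0
```

The first mixed stream is proportional to `streams[0] - streams[1]`, which no doubly stochastic matrix can produce because all of its entries are non-negative. Every doubly stochastic matrix also has spectral norm exactly 1, so under the assumption above the Birkhoff polytope would sit inside the constraint set illustrated here.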
Abstract
Hyper-Connections (HC) generalize residual connections into multiple streams, employing residual matrices for cross-stream feature mixing to enrich model expressivity. However, unconstrained mixing disrupts the identity mapping property intrinsic to the residual connection, causing unstable training. To address this, Manifold-Constrained Hyper-Connections (mHC) and its variant restrict these matrices to the Birkhoff polytope (doubly stochastic matrices) via Sinkhorn iterations or permutation-based parameterizations. We reveal three limitations of this polytope constraint: (1) identity degeneration, where learned matrices collapse around the identity and diminish cross-stream interactions, (2) an expressivity bottleneck, as the non-negativity constraint prevents subtractive feature disentanglement, and (3) parameterization inefficiencies, manifesting as unstable Sinkhorn iterations or…
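The Birkhoff polytope constraint mentioned in the abstract can be illustrated with a minimal Sinkhorn sketch in plain Python. This is a generic textbook version of Sinkhorn normalization, not the mHC implementation; the function name `sinkhorn`, the iteration count, and the example `logits` are assumptions made for illustration.

```python
import math

def sinkhorn(logits, n_iters=100):
    """Alternately normalize rows and columns of exp(logits).

    The result approaches a doubly stochastic matrix, i.e. a point in the
    Birkhoff polytope. Generic sketch, not the paper's code.
    """
    n = len(logits)
    m = [[math.exp(v) for v in row] for row in logits]  # strictly positive entries
    for _ in range(n_iters):
        # Make every row sum to 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Make every column sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m

# Hypothetical unconstrained parameters for a 3-stream residual matrix.
logits = [[0.5, -1.0, 0.2],
          [1.5, 0.3, -0.7],
          [-0.2, 0.8, 1.1]]

ds = sinkhorn(logits)
row_sums = [sum(row) for row in ds]
col_sums = [sum(ds[i][j] for i in range(3)) for j in range(3)]
min_entry = min(min(row) for row in ds)
```

Because the construction starts from `exp(logits)` and only rescales, every entry stays strictly positive. This is the non-negativity that the abstract identifies as an expressivity bottleneck: such a matrix can average streams but cannot subtract one from another.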
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Graph Neural Networks · Stochastic Gradient Optimization Techniques
