The E$\Delta$-MHC-Geo Transformer: Adaptive Geodesic Operations with Guaranteed Orthogonality
Arash Shahmansoori

TL;DR
The paper introduces the E$\Delta$-MHC-Geo Transformer, a novel architecture that guarantees orthogonality in residual connections using adaptive geodesic operations, improving stability and rotation accuracy.
Contribution
It proposes a hybrid approach combining Cayley rotation and Householder reflection to maintain orthogonality across all inputs, addressing limitations of previous methods.
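The two building blocks named above can be checked numerically. The following is a minimal NumPy sketch, not the paper's implementation: the function names, the scalar `beta`, and the random test matrices are assumptions made for illustration. It relies only on standard facts: the Cayley transform of a skew-symmetric matrix is orthogonal and never has eigenvalue $-1$, while $I - \beta k k^\top$ with unit $k$ is orthogonal only for $\beta \in \{0, 2\}$.

```python
import numpy as np

def cayley(A):
    """Cayley transform Q = (I - A)^{-1} (I + A) of a skew-symmetric matrix A."""
    I = np.eye(A.shape[0])
    # I - A is always invertible when A is skew-symmetric.
    return np.linalg.solve(I - A, I + A)

def householder(k, beta):
    """Rank-one operator I - beta * k k^T with k normalised to unit length."""
    k = k / np.linalg.norm(k)
    return np.eye(k.size) - beta * np.outer(k, k)

def orth_error(Q):
    """Frobenius norm of Q^T Q - I; zero for an orthogonal matrix."""
    return np.linalg.norm(Q.T @ Q - np.eye(Q.shape[0]))

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M - M.T                      # skew-symmetric generator (illustrative)
Q = cayley(A)
k = rng.standard_normal(4)

cayley_err = orth_error(Q)                       # near machine precision
reflect_err = orth_error(householder(k, 2.0))    # beta = 2: exact reflection
project_err = orth_error(householder(k, 1.0))    # beta = 1: projection, not orthogonal

# Q + I = 2 (I - A)^{-1} is invertible, so -1 is never an eigenvalue of Q.
det_q_plus_i = np.linalg.det(Q + np.eye(4))
```

The last line illustrates why a pure Cayley rotation cannot represent negation: reaching eigenvalue $-1$ requires a reflection such as the Householder operator at $\beta = 2$.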
Findings
Achieves 1.9x better long-horizon stability than JPmHC.
Attains 4.5x lower rotation loss near $\pi$ than JPmHC.
Preserves norms closely, with a mean deviation of 0.001.
Abstract
We present the E$\Delta$-MHC-Geo Transformer, a novel architecture that unifies Manifold-Constrained Hyper-Connections (mHC), Deep Delta Learning (DDL), and the Cayley transform to obtain input-adaptive, unconditionally orthogonal residual connections. Unlike DDL, whose Householder operator is orthogonal only at particular parameter values, our Data-Dependent Cayley rotation preserves orthogonality for all parameter values and all inputs. To handle negation, an eigenvalue case that Cayley provably excludes, we introduce the E$\Delta$-MHC-Geo Hybrid, which combines Cayley rotation with Householder reflection via a learned operator-selection gate. A midpoint-collapse regularizer encourages boundary gate decisions, where each selected component is orthogonal. In matched-parameter comparisons,…
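To see why boundary gate decisions matter, the sketch below blends a rotation and a reflection with a scalar gate and measures how far the result is from orthogonal. Everything specific here is an assumption for illustration, since this excerpt does not give the paper's formulas: the convex blend `g * Q + (1 - g) * H`, the penalty `g * (1 - g)`, and all function names are hypothetical stand-ins.

```python
import numpy as np

def cayley(A):
    """Cayley transform of a skew-symmetric matrix: an orthogonal rotation."""
    I = np.eye(A.shape[0])
    return np.linalg.solve(I - A, I + A)

def reflection(k):
    """Householder reflection I - 2 k k^T about the unit vector k."""
    k = k / np.linalg.norm(k)
    return np.eye(k.size) - 2.0 * np.outer(k, k)

def blend(Q, H, g):
    """Hypothetical hybrid: convex mix of rotation Q and reflection H."""
    return g * Q + (1.0 - g) * H

def orth_error(B):
    """Frobenius norm of B^T B - I; zero for an orthogonal matrix."""
    return np.linalg.norm(B.T @ B - np.eye(B.shape[0]))

def midpoint_penalty(g):
    """Assumed regularizer shape: largest at g = 0.5, zero at g in {0, 1}."""
    return g * (1.0 - g)

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
Q = cayley(M - M.T)                      # rotation, det = +1
H = reflection(rng.standard_normal(4))   # reflection, det = -1

gates = (0.0, 0.5, 1.0)
errors = {g: orth_error(blend(Q, H, g)) for g in gates}
penalties = {g: midpoint_penalty(g) for g in gates}
```

Under these assumptions the blend is orthogonal only at `g = 0` or `g = 1`. At `g = 0.5` the mix of a determinant $+1$ and a determinant $-1$ operator is singular, so its orthogonality error is at least 1. A penalty that vanishes only at the boundaries therefore pushes the gate toward the two cases where orthogonality holds.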
