Holonorm

Daryl Noupa Yongueng; Hamidou Tembine

arXiv:2511.10504·cs.LG·November 14, 2025

TL;DR

Holonorm is a novel normalization method for transformers that preserves orthogonality and stability, addressing limitations of Tanh-based normalization by mapping vectors into the open unit ball.

Contribution

The paper introduces Holonorm, a new normalization technique with residual connections and nonlinearity, suitable for replacing Tanh in transformer models, improving stability and interpretability.

Findings

01. Holonorm preserves orthogonality and invertibility.

02. Holonorm maps vectors into the open unit ball, preventing exploding activations.

03. Holonorm enhances stability in deep transformer models.
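
The unit-ball and invertibility findings can be illustrated with a minimal sketch. This page does not state the Holonorm formula, so the code assumes the form hn(x) = x / (1 + ||x||), chosen only because it reduces to softsign in dimension one, as the abstract describes. The function names `holonorm` and `holonorm_inverse` are hypothetical and are not taken from the paper.

```python
import math

def holonorm(x):
    """ASSUMED form (not confirmed by this page): x / (1 + ||x||_2)."""
    norm = math.hypot(*x)  # Euclidean norm of the whole vector
    return [v / (1.0 + norm) for v in x]

def holonorm_inverse(y):
    """Inverse of the assumed map: y / (1 - ||y||_2), defined for ||y||_2 < 1."""
    norm = math.hypot(*y)
    if norm >= 1.0:
        raise ValueError("input must lie strictly inside the unit ball")
    return [v / (1.0 - norm) for v in y]

x = [3.0, 4.0]            # ||x|| = 5
y = holonorm(x)           # scaled by 1 / (1 + 5)
x_back = holonorm_inverse(y)

print(math.hypot(*y) < 1.0)  # output norm is ||x|| / (1 + ||x||), below 1
print(x_back)                # recovers x up to floating-point rounding
```

Under this assumption the output norm is ||x|| / (1 + ||x||), which is below 1 for every finite input, and the map can be undone because the input norm can be recovered from the output norm. In floating point, extremely large inputs can round to a norm of exactly 1, so the open-ball property holds only up to numerical precision.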

Abstract

Normalization is a key component of transformer training. In Dynamic Tanh (DyT), the authors demonstrated that Tanh can be used as an alternative to layer normalization (LN) and confirmed the effectiveness of the idea. But Tanh itself faces orthogonality, linearity, and distortion problems, so that proposal is not reliable. We therefore propose Holonorm (hn), which has residual connections and nonlinearity. Holonorm is suitable for replacing Tanh in the context of normalization. Although the Holonorm expression is similar to the softsign function in dimension one, softsign is a componentwise function, which is not well suited to tensors and high-dimensional vectors. Holonorm preserves the orthogonality, the direction, and the invertibility of the signal. Holonorm is also a suitable metric and maps all vectors into the open unit ball. This prevents exploding activations and improves stability…
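
The contrast the abstract draws between a componentwise function and a vector-level map can be sketched as follows. Softsign is the standard v / (1 + |v|) applied per component. The norm-based map `norm_scaled` is an assumed stand-in for Holonorm (x / (1 + ||x||)), since the exact expression is not given here. The helper names and test vectors are illustrative only.

```python
import math

def softsign(x):
    """Componentwise softsign: each entry is squashed independently."""
    return [v / (1.0 + abs(v)) for v in x]

def norm_scaled(x):
    """ASSUMED vector-level map: one positive scale factor for the whole vector."""
    norm = math.hypot(*x)
    return [v / (1.0 + norm) for v in x]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.hypot(*a) * math.hypot(*b))

x = [1.0, 2.0]
y = [-4.0, 2.0]  # orthogonal to x: 1*(-4) + 2*2 = 0

# Orthogonality: broken by componentwise squashing, kept by uniform scaling.
print(dot(softsign(x), softsign(y)))        # nonzero
print(dot(norm_scaled(x), norm_scaled(y)))  # zero up to rounding

# Direction: cosine with the original vector.
print(cosine(x, softsign(x)))     # below 1, direction is distorted
print(cosine(x, norm_scaled(x)))  # 1 up to rounding
```

Because the norm-based map multiplies the whole vector by a single positive scalar, angles between vectors are unchanged, while a componentwise function rescales each coordinate differently and so bends directions.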


Taxonomy

Topics: Magneto-Optical Properties and Applications · Magnetic Properties and Applications · Image and Signal Denoising Methods