RMNP: Row-Momentum Normalized Preconditioning for Scalable Matrix-Based Optimization

Shenyang Deng; Zhuoli Ouyang; Tianyu Pang; Zihang Liu; Ruochen Jin; Shuhua Yu; Yaoqing Yang

arXiv:2603.20527 · cs.LG · May 14, 2026

TL;DR

RMNP is a preconditioned optimizer for deep learning that replaces Muon's Newton-Schulz iteration with row-wise $\ell_2$ normalization, reducing per-iteration complexity from O(mn·min(m,n)) to O(mn) while keeping optimization performance comparable to Muon.

Contribution

The paper proposes RMNP, an optimizer that simplifies the preconditioning step in neural network training and achieves results comparable to Muon at lower computational cost.

Findings

1. RMNP reduces per-iteration complexity from O(mn·min(m,n)) to O(mn).

2. RMNP maintains optimization performance comparable to Muon.

3. Experiments on large language models show RMNP's efficiency and effectiveness.
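
The complexity figures in the first finding can be made concrete with a little arithmetic. The sketch below compares only the leading-order terms quoted above, with all constant factors and iteration counts dropped; the function names and the layer shape are hypothetical and chosen purely for illustration.

```python
def newton_schulz_order(m: int, n: int) -> int:
    """Leading-order term O(mn * min(m, n)) quoted for Newton-Schulz (constants dropped)."""
    return m * n * min(m, n)


def row_norm_order(m: int, n: int) -> int:
    """Leading-order term O(mn) quoted for row-wise normalization (constants dropped)."""
    return m * n


# Hypothetical weight-matrix shape, not taken from the paper.
m, n = 4096, 11008
ratio = newton_schulz_order(m, n) // row_norm_order(m, n)
print(ratio)  # prints 4096, i.e. min(m, n)
```

Under these assumptions the gap between the two asymptotic terms grows with the smaller matrix dimension, so it is largest for wide, square-ish layers. Measured speedups depend on constants and hardware that this count ignores.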

Abstract

Preconditioned adaptive methods have gained significant attention for training deep neural networks, as they capture rich curvature information of the loss landscape. The central challenge in this field lies in balancing preconditioning effectiveness with computational efficiency of implementing the preconditioner. Among recent advances, Muon stands out by using Newton-Schulz iteration to obtain preconditioned updates without explicitly constructing the preconditioning matrix. Despite its advantages, the efficiency of Muon still leaves room for further improvement. In this paper, we introduce RMNP (Row Momentum Normalized Preconditioning), an optimizer that replaces Newton-Schulz iteration with a simple row-wise ($d_{in}$) $\ell_2$ normalization operation, motivated by the empirically observed diagonal block structure of the Transformer layerwise Hessian. We empirically verified…
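
The row-wise $\ell_2$ normalization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the momentum rule (`beta * M + G`), the hyperparameter values, the `eps` guard, the function names, and the convention that each row is a vector of length $d_{in}$ are all assumptions made for illustration. Plain Python lists keep the example dependency-free; a practical version would use a tensor library.

```python
import math


def row_normalize(matrix, eps=1e-8):
    """Scale each row to (approximately) unit l2 norm; eps guards all-zero rows."""
    normalized = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        normalized.append([x / (norm + eps) for x in row])
    return normalized


def rmnp_like_step(weights, momentum, grad, lr=0.02, beta=0.95):
    """One hypothetical update: accumulate momentum, row-normalize it, step.

    The momentum form and the default lr/beta are assumptions, not values
    taken from the paper.
    """
    new_momentum = [
        [beta * m + g for m, g in zip(m_row, g_row)]
        for m_row, g_row in zip(momentum, grad)
    ]
    # Row-wise normalization stands in for the Newton-Schulz iteration.
    direction = row_normalize(new_momentum)
    new_weights = [
        [w - lr * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weights, direction)
    ]
    return new_weights, new_momentum


# Usage on a tiny 2 x 2 example (rows have length d_in = 2).
W = [[1.0, 1.0], [0.5, -0.5]]
M = [[0.0, 0.0], [0.0, 0.0]]
G = [[3.0, 4.0], [0.0, 2.0]]
W_next, M_next = rmnp_like_step(W, M, G)
```

Each row is visited once and every entry is touched a constant number of times, which is consistent with the O(mn) per-iteration cost stated in the findings.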

Code & Models

Repositories

Dominator-Index/RMNP (GitHub)