Intrinsic Muon: Spectral Optimization on Riemannian Matrix Manifolds
Yibang Li, Bihari Lal Pandey, Ravi Sah, Andi Han, Cyrus Mostajeran, Pratik Jawanpuria, Bamdev Mishra

TL;DR
This paper introduces intrinsic Muon (iMuon), a spectral optimization framework on Riemannian matrix manifolds. It replaces the ambient norm ball in the Muon LMO with an intrinsic norm induced by the Riemannian metric, which preserves quotient symmetries and yields closed-form updates with convergence guarantees across several matrix manifolds and norms.
Contribution
It develops a unified intrinsic optimization framework on Riemannian manifolds that yields closed-form solutions and convergence guarantees for multiple matrix constraints and norms.
Findings
iMuon achieves closed-form updates on several manifolds.
Convergence rates depend only on manifold dimension, often only on rank.
Experiments show that iMuon is effective for LLM fine-tuning, image classification, and subspace learning.
Abstract
Muon and related norm-constrained matrix optimizers have become central to large-scale learning problems. They are formulated as a linear maximization oracle (LMO) over an ambient matrix-norm ball in unconstrained Euclidean space. However, these formulations do not generalize cleanly to manifold-valued parameters such as low-rank factorizations, orthogonality constraints, or symmetric positive definite (SPD) matrices. Naively restricting the Muon LMO to the tangent space (i) breaks quotient symmetries and (ii) couples the tangent-space constraint with an ambient norm bound, thereby obstructing closed-form solutions on various manifolds of interest. We resolve both issues with a single observation: every Riemannian metric canonically lifts a unitarily invariant Euclidean norm to an intrinsic norm on each tangent space, and the resulting intrinsic-norm-constrained LMO is symmetry-preserving. Building…
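For readers unfamiliar with the LMO the abstract refers to, the following is a minimal sketch of the unconstrained Euclidean case only, not the paper's intrinsic construction. It assumes the norm ball is the unit spectral-norm ball, for which the maximizer of the inner product with a gradient matrix is the polar factor from its SVD. The function name `spectral_lmo` and the test matrix `G` are illustrative choices, not taken from the paper.

```python
import numpy as np

def spectral_lmo(grad):
    """Return argmax of <grad, X> over {X : ||X||_op <= 1}.

    For the spectral-norm ball the maximizer is U @ Vt, where
    grad = U @ diag(s) @ Vt is a thin SVD (assumed Euclidean setting).
    """
    u, _, vt = np.linalg.svd(grad, full_matrices=False)
    return u @ vt

# Illustrative gradient matrix (hypothetical data).
rng = np.random.default_rng(0)
G = rng.standard_normal((5, 3))
X = spectral_lmo(G)

# The attained inner product equals the nuclear norm of G,
# which is the dual norm of the spectral norm.
inner = float(np.sum(G * X))
nuclear = float(np.linalg.norm(G, "nuc"))
spectral = float(np.linalg.norm(X, 2))
```

A descent step in this setting would move the weights along `-X` scaled by a learning rate. Practical Muon implementations typically approximate the polar factor with a Newton-Schulz iteration rather than an explicit SVD; the SVD is used here only for clarity.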
