Insights on Muon from Simple Quadratics

Antoine Gonon; Andreea-Alexandra Mușat; Nicolas Boumal

arXiv:2602.11948·math.OC·February 13, 2026

Open Access

TL;DR

This paper investigates the empirical success of the Muon optimizer by analyzing its behavior on simple quadratic functions, revealing effects that single-step (local) comparisons and worst-case analyses do not capture.

Contribution

The paper shows how approximation errors in the polar step and structural properties of the objective influence Muon's performance, challenging existing theoretical explanations.

Findings

1. Approximation errors in the polar step can improve finite-time performance.

2. Structural properties of the objective affect optimization constants.

3. Existing theories overlook these effects, so new explanations are needed.

Abstract

Muon updates weight matrices along (approximate) polar factors of the gradients and has shown strong empirical performance in large-scale training. Existing attempts at explaining its performance largely focus on single-step comparisons (on quadratic proxies) and worst-case guarantees that treat the inexactness of the polar factor as a nuisance "to be argued away". We show that already on simple strongly convex functions such as $L(W) = \frac{1}{2} \|W\|_F^2$, these perspectives are insufficient, suggesting that understanding Muon requires going beyond local proxies and pessimistic worst-case bounds. Instead, our analysis exposes two observations that already affect behavior on simple quadratics and are not well captured by prevailing abstractions: (i) approximation error in the polar step can qualitatively alter discrete-time dynamics and improve reachability and finite-time…
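To make the setting in the abstract concrete, here is a minimal sketch (not the paper's code or experiments) of a Muon-like step on $L(W) = \frac{1}{2} \|W\|_F^2$, whose gradient is simply $W$. It compares an exact polar factor computed by SVD with an approximate one. Several choices are assumptions made for illustration: momentum is omitted, the step size is fixed, and the approximation uses the classical cubic Newton-Schulz iteration with Frobenius normalization rather than the tuned polynomial used in practical Muon implementations. All function names are hypothetical.

```python
import numpy as np


def matrix_with_singular_values(sigmas, m, seed=0):
    """Build an m x n matrix (n = len(sigmas), m >= n) with prescribed singular values."""
    rng = np.random.default_rng(seed)
    n = len(sigmas)
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))  # m x n, orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))  # n x n, orthogonal
    return U @ np.diag(sigmas) @ V.T


def polar_exact(G):
    """Exact polar factor U V^T from the thin SVD G = U diag(s) V^T."""
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ Vt


def polar_newton_schulz(G, steps=5, eps=1e-12):
    """Approximate polar factor (assumed variant: cubic Newton-Schulz).

    After Frobenius normalization all singular values lie in [0, 1], and each
    iteration maps a singular value s to 1.5*s - 0.5*s**3, pushing it toward 1.
    With few steps, small singular values stay well below 1.
    """
    X = G / (np.linalg.norm(G) + eps)
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X


def run(W0, polar, lr=0.1, iters=50):
    """Iterate W <- W - lr * polar(grad L(W)) with grad L(W) = W."""
    W = W0.copy()
    losses = []
    for _ in range(iters):
        G = W  # gradient of 0.5 * ||W||_F^2
        W = W - lr * polar(G)
        losses.append(0.5 * np.linalg.norm(W) ** 2)
    return W, losses


if __name__ == "__main__":
    W0 = matrix_with_singular_values([1.0, 0.5, 0.25], m=4)
    for name, polar in [("exact", polar_exact), ("newton-schulz", polar_newton_schulz)]:
        _, losses = run(W0, polar, lr=0.1, iters=50)
        print(f"{name:>14}: final loss = {losses[-1]:.3e}")
```

With the exact polar factor, one step maps each singular value $\sigma$ of $W$ to $|\sigma - \eta|$, where $\eta$ is the step size, so with a fixed $\eta$ the iterates generally cannot settle at the minimizer $W = 0$. Running the script lets you compare this against the inexact step, which is the kind of discrete-time effect the abstract refers to.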
