Rethinking Muon Beyond Pretraining: Spectral Failures and High-Pass Remedies for VLA and RLVR

Chongyu Fan; Gaowen Liu; Mingyi Hong; Ramana Rao Kompella; Sijia Liu

arXiv:2605.19282·cs.LG·May 20, 2026

TL;DR

This paper introduces Pion, a spectral high-pass optimizer designed as a drop-in replacement for Muon. It targets Muon's limitations in vision-language-action (VLA) training and reinforcement learning with verifiable rewards (RLVR), and reports better performance and stability in both settings.

Contribution

The paper proposes Pion, which replaces Muon’s uniform spectral whitening with a spectral high-pass iteration. The authors report that this change improves training stability and effectiveness on VLA and RLVR tasks.

Findings

01. Pion achieves higher success rates than Muon and AdamW in VLA training.

02. Pion achieves higher accuracy on grasp-and-place tasks with a real robot.

03. Pion remains stable on RLVR benchmarks while delivering stronger results.

Abstract

Muon is a matrix-aware optimizer that leverages Newton-Schulz (NS) iterations to enforce spectral gradient orthogonalization by driving all singular values of the momentum matrix toward 1. While this uniform spectral whitening enhances exploration and outperforms AdamW in LLM pretraining, we show it could lead to fundamental limitations beyond pretraining in two regimes: (i) cross-modality vision-language-action (VLA) training, where inherently low-rank action-module gradients cause amplification of noisy tail directions, and (ii) reinforcement learning with verifiable rewards (RLVR), where low-SNR gradients and the need to preserve per-head specialization from prior training make whitening unstable. To address these challenges, we propose Pion, a drop-in replacement for Muon that preserves its computational efficiency while replacing uniform spectral whitening with a two-stage…
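
The uniform whitening that the abstract attributes to Muon can be illustrated with a small, dependency-free sketch. It is not the authors' code and it does not implement Pion. It uses the classical cubic Newton-Schulz update (X ← 1.5·X − 0.5·X·Xᵀ·X) as a stand-in, whereas Muon implementations typically use a tuned polynomial with only a few steps. The function names, the step count, and the toy momentum matrix are all assumptions chosen for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(m, steps=30):
    """Push every nonzero singular value of m toward 1.

    Assumption: classical cubic Newton-Schulz, not Muon's tuned variant.
    """
    # Scale by the Frobenius norm so all singular values lie in (0, 1],
    # which is inside the convergence region of the cubic iteration.
    fro = math.sqrt(sum(v * v for row in m for v in row))
    x = [[v / fro for v in row] for row in m]
    for _ in range(steps):
        xxt = matmul(x, transpose(x))   # X X^T
        xxtx = matmul(xxt, x)           # X X^T X
        # Each singular value s is mapped to 1.5*s - 0.5*s**3.
        x = [[1.5 * p - 0.5 * q for p, q in zip(rx, rq)] for rx, rq in zip(x, xxtx)]
    return x


# Toy momentum matrix (hypothetical): one dominant direction (3.0)
# and one weak "tail" direction (0.1).
momentum = [[3.0, 0.0, 0.0],
            [0.0, 0.1, 0.0]]
whitened = newton_schulz_orthogonalize(momentum)
```

In this toy case both singular values end up near 1, so the weak direction receives the same update magnitude as the dominant one. This is the kind of tail amplification the abstract identifies as a problem when gradients are inherently low-rank or low-SNR.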

Code & Models

Repositories

optml-group/Pion
github
