MuonRec: Shifting the Optimizer Paradigm Beyond Adam in Scalable Generative Recommendation
Rong Shan, Aofan Yu, Bo Chen, Kuo Cai, Qiang Luo, Ruiming Tang, Han Li, Weiwen Liu, Weinan Zhang, Jianghao Lin

TL;DR
This paper introduces MuonRec, a framework that brings the Muon optimizer to recommender-system training. Compared with Adam/AdamW, it needs fewer training steps and improves ranking quality, especially for generative recommenders.
Contribution
MuonRec is the first framework to apply the Muon optimizer to RecSys training, improving training efficiency and recommendation quality over Adam/AdamW.
Findings
Reduces the number of training steps by 32.4% on average.
Achieves a 12.6% relative gain in NDCG@10.
Outperforms Adam/AdamW baselines on both traditional sequential recommenders and modern generative recommenders.
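For readers unfamiliar with the metric, the following is a minimal sketch of how NDCG@10 and a relative gain are commonly computed. It assumes binary relevance (an item is either a held-out target or not); this summary does not state which NDCG variant the paper uses. The function names and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def ndcg_at_k(ranked_items, relevant_items, k=10):
    """NDCG@k with binary relevance (an assumption, not the paper's stated setup)."""
    relevant = set(relevant_items)
    # DCG: each hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (at most k) placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def relative_gain(new_score, baseline_score):
    """Relative improvement of new_score over baseline_score, as a fraction."""
    return (new_score - baseline_score) / baseline_score


# Illustrative values only (not results from the paper).
print(ndcg_at_k(["b", "a", "c"], ["a"]))  # target ranked second
print(relative_gain(0.1126, 0.1000))      # a 12.6% relative gain
```

A relative gain is measured against the baseline's score, so the same percentage can correspond to very different absolute differences depending on how high the baseline is.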
Abstract
Recommender systems (RecSys) are increasingly emphasizing scaling, leveraging larger architectures and more interaction data to improve personalization. Yet, despite the optimizer's pivotal role in training, modern RecSys pipelines almost universally default to Adam/AdamW, with limited scrutiny of whether these choices are truly optimal for recommendation. In this work, we revisit optimizer design for scalable recommendation and introduce MuonRec, the first framework that brings the recently proposed Muon optimizer to RecSys training. Muon performs orthogonalized momentum updates for 2D weight matrices via Newton-Schulz iteration, promoting diverse update directions and improving optimization efficiency. We develop an open-source training recipe for recommendation models and evaluate it across both traditional sequential recommenders and modern generative recommenders. Extensive…
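The orthogonalized momentum update described in the abstract can be sketched roughly as follows. This is an illustrative NumPy sketch, not the MuonRec implementation: it uses the classic cubic Newton-Schulz iteration for clarity, whereas the Muon reference implementation uses a tuned quintic polynomial with only a few steps. The function names, the step count, and the `lr` and `beta` defaults are assumptions made for this example.

```python
import numpy as np


def newton_schulz_orthogonalize(G, steps=30, eps=1e-7):
    """Approximate the nearest semi-orthogonal matrix to a 2D matrix G.

    Uses the classic cubic Newton-Schulz iteration (an illustrative stand-in
    for the tuned polynomial used by Muon).
    """
    # Frobenius normalization bounds every singular value by 1,
    # which keeps the iteration inside its convergence region.
    X = G / (np.linalg.norm(G) + eps)
    # Work on the wide orientation so X @ X.T is the smaller Gram matrix.
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T
    for _ in range(steps):
        # Pushes each singular value s toward 1 via s <- 1.5*s - 0.5*s**3.
        X = 1.5 * X - 0.5 * (X @ X.T) @ X
    return X.T if transposed else X


def muon_style_step(W, grad, momentum_buf, lr=0.02, beta=0.95):
    """One hypothetical update for a 2D weight matrix W."""
    momentum_buf = beta * momentum_buf + grad            # accumulate momentum
    update = newton_schulz_orthogonalize(momentum_buf)   # orthogonalize it
    return W - lr * update, momentum_buf


rng = np.random.default_rng(0)
W = rng.standard_normal((4, 6))
grad = rng.standard_normal((4, 6))
W_new, buf = muon_style_step(W, grad, np.zeros_like(W))
```

Because the orthogonalized update has all singular values close to 1, no single direction dominates the step, which is one way to read the abstract's remark about promoting diverse update directions.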
Taxonomy
Topics: Recommender Systems and Techniques · Machine Learning in Materials Science · Epigenetics and DNA Methylation
