OLion: Approaching the Hadamard Ideal by Intersecting Spectral and $\ell_{\infty}$ Implicit Biases
Zixiao Wang, Yifei Shen, Huishuai Zhang

TL;DR
OLion introduces a novel optimizer that combines spectral and coordinate control to efficiently approximate the Hadamard ideal, demonstrating superior performance in large-scale language and vision training tasks.
Contribution
The paper presents OLion, a new optimizer that integrates spectral control with $\ell_{\infty}$ biases, providing an efficient approximation to the Hadamard ideal with convergence guarantees.
Findings
Matches or outperforms AdamW and Muon in large-scale training
Mitigates optimizer mismatch during fine-tuning
Convergence proven under a mild, empirically verified diagonal-isotropy assumption
Abstract
Many optimizers can be interpreted as steepest-descent methods under norm-induced geometries, and thus inherit the corresponding implicit biases. We introduce OLion, which combines spectral control from orthogonalized update directions with $\ell_{\infty}$-style coordinate control from sign updates. OLion forms a Lion-style momentum direction, approximately orthogonalizes it via a few Newton-Schulz iterations, and then applies an entrywise sign, providing an efficient approximation to taking a maximal step over the intersection of the spectral and $\ell_{\infty}$ constraint sets (a scaled Hadamard-like set for matrix parameters). Despite the strong nonlinearity of orthogonalization and sign, we prove convergence under a mild, empirically verified diagonal-isotropy assumption. Across large-scale language and vision training, including GPT-2 and Llama pretraining, SiT image…
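The three stages named in the abstract (Lion-style momentum direction, approximate orthogonalization, entrywise sign) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the cubic Newton-Schulz polynomial, the Frobenius normalization, the hyperparameter defaults, and the names `newton_schulz` and `olion_like_step` are all assumptions made here, and the abstract does not specify step-size scaling or weight decay, so both are omitted. Matrices are plain lists of lists to keep the sketch dependency-free.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def newton_schulz(g, steps=5, eps=1e-7):
    """Approximately orthogonalize g (assumed variant: cubic Newton-Schulz).

    Dividing by the Frobenius norm puts every singular value in (0, 1],
    where the map s -> 1.5*s - 0.5*s**3 pushes singular values toward 1.
    """
    norm = math.sqrt(sum(v * v for row in g for v in row)) + eps
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(rx, rb)] for rx, rb in zip(x, xxtx)]
    return x


def sign(v):
    """Return -1, 0, or 1."""
    return (v > 0) - (v < 0)


def olion_like_step(w, m, g, lr=1e-3, beta1=0.9, beta2=0.99):
    """One hypothetical update for a matrix parameter w with momentum m and gradient g."""
    # 1) Lion-style direction: interpolate the momentum buffer and the gradient.
    c = [[beta1 * mv + (1 - beta1) * gv for mv, gv in zip(mr, gr)] for mr, gr in zip(m, g)]
    # 2) Spectral control: approximately orthogonalize the direction.
    o = newton_schulz(c)
    # 3) Coordinate control: entrywise sign, so every entry moves by at most lr.
    w_new = [[wv - lr * sign(ov) for wv, ov in zip(wr, orow)] for wr, orow in zip(w, o)]
    # Lion-style momentum refresh with a second coefficient (an assumption here).
    m_new = [[beta2 * mv + (1 - beta2) * gv for mv, gv in zip(mr, gr)] for mr, gr in zip(m, g)]
    return w_new, m_new


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    grad = [[2.0, 0.0], [0.0, 0.5]]
    w1, m1 = olion_like_step(zeros, zeros, grad)
    print(w1, m1)
```

In this toy call the two diagonal gradient entries differ by a factor of four, yet both weights move by the same amount `lr`, which is the coordinate-wise uniformity the sign step is meant to provide.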
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Quantum Computing Algorithms and Architecture · Advanced Bandit Algorithms Research
