ARO: A New Lens On Matrix Optimization For Large Models
Wenbo Gong, Javier Zazo, Qijun Luo, Puqian Wang, James Hensman, Chao Ma

TL;DR
This paper introduces Adaptively Rotated Optimization (ARO), a matrix optimization framework that improves large language model training efficiency by treating gradient rotation as a core design principle, and evaluates it under a new, controlled benchmarking protocol.
Contribution
ARO offers a new paradigm beyond orthogonalization for matrix optimization in LLM training. It improves sample efficiency and is evaluated under a rigorously controlled benchmarking protocol.
Findings
ARO outperforms AdamW by 1.3-1.35x in LLM pretraining.
ARO surpasses orthogonalization methods by 1.1-1.15x.
ARO maintains efficiency up to 8B parameters and 8x overtrain budget.
Abstract
Matrix-based optimizers have attracted growing interest for improving LLM training efficiency, with significant progress centered on orthogonalization/whitening-based methods. While these methods yield substantial performance gains, a fundamental question arises: can we develop new paradigms beyond orthogonalization, pushing the efficiency frontier further? We present Adaptively Rotated Optimization (ARO), a new matrix optimization framework that treats gradient rotation as a first-class design principle. ARO accelerates LLM training by performing normed steepest descent in a rotated coordinate system, where the rotation is determined by a novel norm-informed policy. This perspective yields update rules that go beyond existing orthogonalization and whitening optimizers, improving sample efficiency in practice. To make comparisons reliable, we propose a rigorously controlled benchmarking…
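The phrase "normed steepest descent in a rotated coordinate system" can be illustrated with a small NumPy sketch. This is not the paper's algorithm: the abstract does not spell out the norm-informed rotation policy, so the rotation used here (the left singular vectors of the gradient) is a stand-in chosen for illustration, and the helper names `row_normalize` and `rotated_step` are hypothetical. Under those assumptions, the sketch shows that rotating by the left singular vectors and taking a row-normalized step recovers the familiar orthogonalized update, which suggests how orthogonalization can be read as one particular choice of rotation.

```python
import numpy as np

def row_normalize(M, eps=1e-12):
    """Steepest-descent direction under the max-row-L2 norm:
    rescale every row of M to unit Euclidean length."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.maximum(norms, eps)

def rotated_step(G, R, base_step):
    """Rotate the gradient into the basis given by R, take a normed
    steepest-descent step there, then rotate the step back."""
    return R @ base_step(R.T @ G)

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 6))  # toy gradient matrix (assumed shape)

# Illustrative rotation policy (an assumption, not the paper's policy):
# use the left singular vectors of G as the rotation.
U, S, Vt = np.linalg.svd(G, full_matrices=False)
update = rotated_step(G, U, row_normalize)

# With the identity rotation and an elementwise sign step, the same
# template reduces to sign descent.
sign_update = rotated_step(G, np.eye(G.shape[0]), np.sign)
```

In this toy case `update` equals `U @ Vt`, the orthogonal polar factor of `G`, because the rotated gradient `U.T @ G` has rows `S[i] * Vt[i]` and row normalization strips the singular values. Changing `R` or `base_step` changes the resulting update rule, which is the design axis the abstract describes.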
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning in Materials Science · Advanced Optimization Algorithms Research
