Optimal Projection-Free Adaptive SGD for Matrix Optimization
Dmitry Kovalev

TL;DR
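This paper gives an improved analysis of Leon, a practical projection-free variant of One-sided Shampoo for matrix optimization. It shows that the extra hyperparameter in Leon's preconditioner need not be tuned, obtains dimension-independent convergence guarantees, and builds a projection-free variant with Nesterov acceleration.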
Contribution
It develops what the abstract describes as the first practical variant of One-sided Shampoo with Nesterov acceleration that does not require computing a projection at each iteration, supported by an improved analysis of Leon's preconditioner.
Findings
Proved stability properties of Leon's preconditioner.
Obtained dimension-independent convergence guarantees for convex optimization problems beyond the bounded-gradients assumption.
Created a unified analysis framework for accelerated projection-free SGD.
Abstract
Recently, Jiang et al. [2026] developed Leon, a practical variant of the One-sided Shampoo algorithm [Xie et al., 2025a, An et al., 2025] for online convex optimization, which does not require computing a costly quadratic projection at each iteration. Unfortunately, according to the existing analysis, Leon requires tuning an additional hyperparameter in its preconditioner and cannot achieve dimension-independent convergence guarantees for convex optimization problems beyond the bounded gradients assumption. In this paper, we resolve this issue by proving certain stability properties of Leon's preconditioner. Using our improved analysis, we show that tuning the extra hyperparameter can be avoided and, more importantly, develop the first practical variant of One-sided Shampoo with Nesterov acceleration, which does not require computing projections at each iteration. As a side contribution, we…
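For readers unfamiliar with the setting, the following is a minimal sketch of the generic left-preconditioned update usually associated with One-sided Shampoo, written for the unconstrained case so that no projection appears. It is not Leon's preconditioner and not the accelerated method from the paper, neither of which is specified in this summary. The function name, the step size lr, and the damping term eps are illustrative assumptions.

```python
import numpy as np

def one_sided_shampoo_step(X, G, L, lr=0.1, eps=1e-8):
    """One illustrative left-preconditioned step (hypothetical helper).

    X : (m, n) current matrix iterate
    G : (m, n) stochastic gradient at X
    L : (m, m) running sum of G @ G.T from earlier steps
    lr, eps : assumed step size and damping, not taken from the paper
    """
    # Accumulate the one-sided (left) second-moment statistic.
    L = L + G @ G.T
    # L is symmetric positive semidefinite, so eigh applies.
    w, V = np.linalg.eigh(L)
    # Damped inverse square root: V diag(1 / sqrt(w + eps)) V^T.
    inv_sqrt = (V / np.sqrt(np.maximum(w, 0.0) + eps)) @ V.T
    # Unconstrained update: no projection step is included in this sketch.
    X_new = X - lr * (inv_sqrt @ G)
    return X_new, L
```

Only the m-by-m left statistic is stored and factored here, which is what distinguishes the one-sided scheme from two-sided Shampoo.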
