Convergence Rate Analysis of SOAP with Arbitrary Orthogonal Projection Matrices
Huan Li, Zhouchen Lin

TL;DR
This paper analyzes the convergence rate of SOAP, a matrix-based optimizer for deep neural networks, and extends the analysis to a more general variant that admits arbitrary orthogonal projection matrices.
Contribution
It establishes the convergence rate of SOAP and shows that the same analysis covers a generalized variant with arbitrary orthogonal projection matrices.
Findings
First convergence rate analysis of SOAP.
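Extension to a more general variant whose orthogonal projection matrices need only be conditionally independent of the current stochastic gradient at each iteration.
Covers projection matrices constructed from information available up to the preceding step.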
Abstract
In this short note, we establish, for the first time, the convergence rate of SOAP, an efficient and popular matrix-based optimizer for training deep neural networks. Our analysis extends to a more general variant of SOAP that admits arbitrary orthogonal projection matrices and requires only that these matrices be conditionally independent of the current stochastic gradient at each iteration. For example, they may be constructed from information available up to the preceding step.
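The setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or their analyzed update rule. It is a simplified SOAP-style step that assumes the projections are eigenvector bases of running gradient statistics, uses an Adam-like update without bias correction, and omits details such as infrequent basis refreshes. All function names, hyperparameters, and the toy quadratic objective are hypothetical. The point it shows is the ordering: the orthogonal matrices `Q_L` and `Q_R` are built from statistics accumulated up to the preceding step, before the current stochastic gradient is drawn.

```python
import numpy as np

def orthogonal_basis(stat):
    # Eigenvectors of a symmetric matrix form an orthogonal matrix.
    _, Q = np.linalg.eigh(stat)
    return Q

def projected_adam_step(W, G, Q_L, Q_R, m, v,
                        lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    # Hypothetical simplified step: first moment kept in the original
    # coordinates, second moment kept in the rotated coordinates.
    m = b1 * m + (1 - b1) * G
    G_rot = Q_L.T @ G @ Q_R            # rotate the current gradient
    m_rot = Q_L.T @ m @ Q_R            # rotate the momentum
    v = b2 * v + (1 - b2) * G_rot ** 2
    update = Q_L @ (m_rot / (np.sqrt(v) + eps)) @ Q_R.T  # rotate back
    return W - lr * update, m, v

# Toy run on f(W) = 0.5 * ||W||_F^2 with noisy gradients (assumed setup).
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
L_stat, R_stat = np.eye(4), np.eye(3)
m, v = np.zeros_like(W), np.zeros_like(W)

for t in range(5):
    # Projections use only statistics from steps before t.
    Q_L, Q_R = orthogonal_basis(L_stat), orthogonal_basis(R_stat)
    G = W + 0.1 * rng.standard_normal(W.shape)  # current stochastic gradient
    W, m, v = projected_adam_step(W, G, Q_L, Q_R, m, v)
    # Only now is the current gradient folded into the statistics.
    L_stat = 0.95 * L_stat + 0.05 * (G @ G.T)
    R_stat = 0.95 * R_stat + 0.05 * (G.T @ G)
```

Any other rule that produces orthogonal `Q_L` and `Q_R` without looking at the current gradient could be substituted for `orthogonal_basis` in this sketch, which is the kind of freedom the generalized variant is described as allowing.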
