A geometric framework for momentum-based optimizers for low-rank training

Steffen Schotthöfer; Timon Klein; Jonas Kusch

arXiv:2506.17475 · cs.LG · October 31, 2025

TL;DR

This paper introduces a geometric framework for momentum-based optimizers tailored for low-rank neural network training, addressing convergence issues and improving efficiency by incorporating the intrinsic geometry of the parameter space.

Contribution

It proposes novel momentum-based optimization strategies derived from dynamical low-rank approximation that better respect the geometric structure of low-rank training landscapes.

Findings

01. Faster convergence in low-rank training scenarios

02. Stronger validation metrics at fixed parameter budgets

03. Classical momentum methods may struggle with low-rank geometries

Abstract

Low-rank pre-training and fine-tuning have recently emerged as promising techniques for reducing the computational and storage costs of large neural networks. Training low-rank parameterizations typically relies on conventional optimizers such as heavy ball momentum methods or Adam. In this work, we identify and analyze potential difficulties that these training methods encounter when used to train low-rank parameterizations of weights. In particular, we show that classical momentum methods can struggle to converge to a local optimum due to the geometry of the underlying optimization landscape. To address this, we introduce novel training strategies derived from dynamical low-rank approximation, which explicitly account for the underlying geometric structure. Our approach leverages and combines tools from dynamical low-rank approximation and momentum-based optimization to design…
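To make the baseline setting concrete, the following is a minimal sketch of the conventional approach the abstract refers to: heavy ball momentum applied directly to the factors of a low-rank parameterization W = A B^T. It is not the method proposed in the paper. The toy least-squares objective, the function name `heavy_ball_low_rank`, and all hyperparameter values are assumptions chosen purely for illustration.

```python
import numpy as np


def heavy_ball_low_rank(target, rank, lr=0.01, beta=0.9, steps=500, seed=0):
    """Heavy ball momentum on the factors A, B of W = A @ B.T.

    Minimizes the toy objective 0.5 * ||A @ B.T - target||_F^2.
    This is the conventional factor-wise baseline, not the paper's optimizer.
    """
    rng = np.random.default_rng(seed)
    m, n = target.shape
    # Small random initialization of both factors (an illustrative choice).
    A = 0.1 * rng.standard_normal((m, rank))
    B = 0.1 * rng.standard_normal((n, rank))
    # One momentum buffer per factor, as a standard optimizer would keep.
    vel_A = np.zeros_like(A)
    vel_B = np.zeros_like(B)
    losses = []
    for _ in range(steps):
        residual = A @ B.T - target
        losses.append(0.5 * np.sum(residual ** 2))
        # Gradients of the objective with respect to each factor.
        grad_A = residual @ B
        grad_B = residual.T @ A
        # Heavy ball update: decay the velocity, add the scaled negative gradient.
        vel_A = beta * vel_A - lr * grad_A
        vel_B = beta * vel_B - lr * grad_B
        A = A + vel_A
        B = B + vel_B
    return A, B, losses


# Usage on a synthetic rank-2 target (hypothetical data).
rng = np.random.default_rng(1)
target = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))
A, B, losses = heavy_ball_low_rank(target, rank=2)
print(f"initial loss: {losses[0]:.4f}, final loss: {losses[-1]:.4f}")
```

Note that the momentum buffers here live in the space of the individual factors and ignore how A and B jointly determine W. This factor-wise treatment is the kind of geometry-agnostic update that, according to the abstract, can run into difficulties.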
