SubTrack++ : Gradient Subspace Tracking for Scalable LLM Training

Sahar Rajabi; Nayeema Nonta; Sirisha Rambhatla

arXiv:2502.01586·cs.LG·October 28, 2025

TL;DR

SubTrack++ is a gradient subspace tracking method built on Grassmannian geometry, projection-aware optimizers, and recovery scaling. It reduces pre-training wall-time by up to 65% and fine-tuning time by 36% for large language models without increasing memory use.

Contribution

It presents a new approach combining Grassmannian subspace tracking with projection-aware optimizers and recovery scaling to improve LLM training efficiency.

Findings

01 Achieves up to 65% reduction in pre-training wall-time.

02 Reduces fine-tuning time by 36%.

03 Maintains the same memory footprint as existing methods.

Abstract

Training large language models (LLMs) is highly resource-intensive due to their massive number of parameters and the overhead of optimizer states. While recent work has aimed to reduce memory consumption, such efforts often entail trade-offs among memory efficiency, training time, and model performance. Yet, true democratization of LLMs requires simultaneous progress across all three dimensions. To this end, we propose SubTrack++ that leverages Grassmannian gradient subspace tracking combined with projection-aware optimizers, enabling Adam's internal statistics to adapt to subspace changes. Additionally, employing recovery scaling, a technique that restores information lost through low-rank projections, further enhances model performance. Our method demonstrates SOTA convergence by exploiting Grassmannian geometry, reducing pre-training wall-time by up to 65% and fine-tuning time by 36%…
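The three ingredients named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the QR retraction (used here in place of an exact Grassmannian geodesic step), the step size, and the global-norm form of the recovery scale are all assumptions made for illustration. The projection-aware handling of Adam's moment estimates is omitted.

```python
import numpy as np

def init_subspace(grad, rank):
    """Orthonormal basis for the top-`rank` left singular directions of `grad`."""
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank]

def track_subspace(basis, grad, step_size=0.1):
    """One tracking step that lowers ||grad - basis @ basis.T @ grad||_F^2.

    Assumption: a first-order step followed by QR re-orthonormalization
    (a retraction), standing in for a geodesic update on the Grassmannian.
    """
    low_rank = basis.T @ grad              # (r, n) projected gradient
    residual = grad - basis @ low_rank     # (m, n) part outside the subspace
    # Descent direction for the projection error. It is orthogonal to
    # span(basis), so it is already a tangent direction.
    direction = residual @ low_rank.T      # (m, r)
    q, _ = np.linalg.qr(basis + step_size * direction)
    return q

def recovery_scaled_update(basis, grad, low_rank_update, eps=1e-8):
    """Add back the discarded residual, scaled to match the optimizer's output.

    Assumption: one global scale, ||optimizer output|| / ||projected gradient||.
    """
    low_rank_grad = basis.T @ grad
    residual = grad - basis @ low_rank_grad
    scale = np.linalg.norm(low_rank_update) / (np.linalg.norm(low_rank_grad) + eps)
    return basis @ low_rank_update + scale * residual

# Hypothetical usage on one weight matrix's gradient.
rng = np.random.default_rng(0)
grad = rng.standard_normal((6, 4))
basis = init_subspace(grad, rank=2)
basis = track_subspace(basis, rng.standard_normal((6, 4)))
low_rank_update = basis.T @ grad           # placeholder for the optimizer output
update = recovery_scaled_update(basis, grad, low_rank_update)
```

In this sketch the optimizer only ever sees the `(rank, n)` projected gradient, which is where the memory saving of low-rank methods comes from. The tracking step costs a few matrix products and a thin QR factorization, with no full SVD after initialization.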

Videos

SubTrack++ : Gradient Subspace Tracking for Scalable LLM Training (SlidesLive)

Taxonomy

Topics: Target Tracking and Data Fusion in Sensor Networks · Neural Networks and Applications · EEG and Brain-Computer Interfaces