OASIS: Online Activation Subspace Learning for Memory-Efficient Training
Sakshi Choudhary, Utkarsh Saxena, Kaushik Roy

TL;DR
OASIS is an online learning algorithm that tracks a low-dimensional activation subspace during training, cutting peak training memory by up to 2x while keeping performance comparable to full fine-tuning.
Contribution
The paper proposes a memory-efficient training method that continuously updates an activation subspace, so that gradients and optimizer states can be kept in low-rank form without altering forward computations.
Findings
Achieves up to 2x lower peak memory during training.
Maintains performance comparable to full fine-tuning.
Outperforms prior low-rank methods across a range of tasks.
Abstract
Training large language models (LLMs) is constrained by memory requirements, with activations accounting for a substantial fraction of the total footprint. Existing approaches reduce memory using low-rank weight parameterizations or low-rank gradient subspaces for optimizer states, while activation memory is addressed through architectural modifications or compression schemes based on periodically updated projections. We propose OASIS, an online activation subspace learning algorithm for memory-efficient training that tracks and continuously updates a low-dimensional activation subspace during training. Intermediate activations are projected onto this evolving subspace, reducing memory without modifying forward-pass computations. The evolving activation subspace induces low-rank gradient representations, enabling both gradients and optimizer states to be maintained directly in this…
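To make the idea in the abstract concrete, here is a minimal NumPy sketch for a single linear layer Y = XW. It is not the authors' algorithm: the subspace update uses an Oja-style step followed by QR re-orthonormalization as a stand-in, and the names `oja_update`, `compress`, and `low_rank_grad`, the learning rate, and the toy dimensions are all assumptions made for illustration. The sketch shows how storing the projected activations XP instead of X shrinks what is kept for the backward pass, and how the weight gradient then lives naturally in the r-dimensional subspace.

```python
import numpy as np

def oja_update(P, X, lr=0.5):
    """One Oja-style subspace step (illustrative stand-in, not the paper's rule).

    P: (d_in, r) orthonormal basis, X: (n, d_in) batch of activations.
    """
    n = X.shape[0]
    P_new = P + lr * (X.T @ (X @ P)) / n   # pull basis toward batch covariance
    Q, _ = np.linalg.qr(P_new)             # re-orthonormalize the columns
    return Q

def compress(X, P):
    """Project activations onto the subspace: store (n, r) instead of (n, d_in)."""
    return X @ P

def low_rank_grad(X_low, dY):
    """Weight gradient in subspace coordinates: (r, d_out) instead of (d_in, d_out)."""
    return X_low.T @ dY

rng = np.random.default_rng(0)
n, d_in, d_out, r = 64, 32, 16, 4

# Toy data: activations that lie exactly in a hidden r-dimensional subspace.
basis, _ = np.linalg.qr(rng.standard_normal((d_in, r)))
P, _ = np.linalg.qr(rng.standard_normal((d_in, r)))

# Track the subspace online, one batch at a time.
for _ in range(200):
    X = rng.standard_normal((n, r)) @ basis.T
    P = oja_update(P, X)

# Backward pass for one batch using only the compressed activations.
X = rng.standard_normal((n, r)) @ basis.T
dY = rng.standard_normal((n, d_out))
X_low = compress(X, P)             # what would be saved for backward
G_low = low_rank_grad(X_low, dY)   # gradient (and optimizer state) shape
G_full = X.T @ dY                  # reference full gradient, for comparison only

rel_err = np.linalg.norm(P @ G_low - G_full) / np.linalg.norm(G_full)
print(X_low.shape, G_low.shape, rel_err)
```

In this toy setting the activations are exactly low-rank, so the lifted gradient `P @ G_low` closely matches the full gradient once the basis has converged. Real activations are only approximately low-rank, so the projection would introduce some error that depends on the chosen rank.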
