OASIS: Online Activation Subspace Learning for Memory-Efficient Training

Sakshi Choudhary; Utkarsh Saxena; Kaushik Roy

arXiv:2604.09406 · cs.LG · April 13, 2026

TL;DR

OASIS is an online learning algorithm that tracks a low-dimensional activation subspace during training and projects intermediate activations onto it, reducing peak training memory by up to 2x while keeping performance comparable to full fine-tuning.

Contribution

The paper proposes a memory-efficient training method that continuously updates an activation subspace, which yields low-rank gradient and optimizer-state representations without altering forward-pass computations.
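To see why a low-dimensional activation subspace can induce low-rank gradients, consider a single linear layer. The derivation below is an illustrative sketch, not the paper's own formulation: the symbols X (input activations), W (weights), G (upstream gradient), P (orthonormal subspace basis), and the rank r are all notation assumed here.

```latex
% Linear layer: Y = X W, with X \in \mathbb{R}^{n \times d}, W \in \mathbb{R}^{d \times m}.
% Upstream gradient: G = \partial \mathcal{L} / \partial Y \in \mathbb{R}^{n \times m}.
\nabla_W \mathcal{L} = X^\top G

% Assume the activations lie close to the span of an orthonormal basis
% P \in \mathbb{R}^{d \times r}, with r \ll d and P^\top P = I_r:
X \approx X P P^\top

% Store only the projected activations \tilde{X} = X P \in \mathbb{R}^{n \times r}. Then
\nabla_W \mathcal{L} \approx P \, \tilde{X}^\top G,
\qquad
P^\top \nabla_W \mathcal{L} = \tilde{X}^\top G \in \mathbb{R}^{r \times m}
```

Under these assumptions the backward pass needs n × r stored values instead of n × d, and the r × m projected gradient is the quantity in which an optimizer state could be kept.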

Findings

01 Achieves up to 2x lower peak memory during training.

02 Maintains performance comparable to full fine-tuning.

03 Outperforms prior low-rank methods in various tasks.

Abstract

Training large language models (LLMs) is constrained by memory requirements, with activations accounting for a substantial fraction of the total footprint. Existing approaches reduce memory using low-rank weight parameterizations or low-rank gradient subspaces for optimizer states, while activation memory is addressed through architectural modifications or compression schemes based on periodically updated projections. We propose OASIS, an online activation subspace learning algorithm for memory-efficient training that tracks and continuously updates a low-dimensional activation subspace during training. Intermediate activations are projected onto this evolving subspace, reducing memory without modifying forward-pass computations. The evolving activation subspace induces low-rank gradient representations, enabling both gradients and optimizer states to be maintained directly in this…
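The tracking idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the OASIS algorithm itself: the abstract does not give the update rule, so the sketch swaps in the classical Oja subspace rule with QR re-orthonormalization. The data is synthetic, and every name (`oja_step`, `sample_batch`, `P`, `X_low`, the sizes and learning rate) is hypothetical.

```python
import numpy as np

def oja_step(P, X, lr=0.1):
    """One online subspace update using Oja's rule (an assumed stand-in,
    not the update rule from the paper).

    P: (d, r) orthonormal basis; X: (n, d) batch of activations.
    """
    n = X.shape[0]
    CP = X.T @ (X @ P) / n                # covariance estimate applied to P
    P = P + lr * (CP - P @ (P.T @ CP))    # move toward the dominant subspace
    Q, _ = np.linalg.qr(P)                # re-orthonormalize the columns
    return Q

rng = np.random.default_rng(0)
n, d, r, d_out = 64, 64, 4, 16            # hypothetical sizes

# Synthetic activations that lie near a hidden rank-r subspace spanned by U.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))
P, _ = np.linalg.qr(rng.standard_normal((d, r)))   # random initial basis
W = rng.standard_normal((d, d_out)) / np.sqrt(d)

def sample_batch():
    Z = rng.standard_normal((n, r))
    return Z @ U.T + 0.01 * rng.standard_normal((n, d))

for _ in range(500):
    X = sample_batch()
    Y = X @ W              # forward pass uses the full activations, unchanged
    P = oja_step(P, X)     # basis is refreshed at every step
    X_low = X @ P          # (n, r) array kept for backward instead of (n, d)

G = rng.standard_normal((n, d_out))       # stand-in for the upstream gradient
grad_low = X_low.T @ G                    # (r, d_out) projected weight gradient
grad_full = X.T @ G                       # (d, d_out) reference gradient
rel_err = np.linalg.norm(P @ grad_low - grad_full) / np.linalg.norm(grad_full)
energy = np.linalg.norm(X_low) ** 2 / np.linalg.norm(X) ** 2

print("stored activation shape:", X_low.shape, "instead of", X.shape)
print("projected gradient shape:", grad_low.shape)
```

In this toy setting the saved activations shrink from n × d to n × r values per layer, and `grad_low` equals `P.T @ grad_full` by construction. `rel_err` and `energy` measure how much of the full gradient and of the activation energy the tracked basis retains, which depends on how close the activations really are to a rank-r subspace.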
