CERSA: Cumulative Energy-Retaining Subspace Adaptation for Memory-Efficient Fine-Tuning
Jingze Ge, Xue Geng, Yun Liu, Wanqi Dong, Wang Zhe Mark, Min Wu, Ngai-Man Cheung, Bharadwaj Veeravalli, Xulei Yang

TL;DR
CERSA is a novel fine-tuning method that uses SVD to retain principal components, significantly reducing memory use while outperforming existing PEFT techniques across various tasks.
Contribution
CERSA introduces a spectral energy-based subspace adaptation approach for memory-efficient fine-tuning of large models, surpassing current methods.
Findings
CERSA outperforms state-of-the-art PEFT methods in accuracy.
CERSA reduces memory consumption substantially.
CERSA is effective across multiple domains and model scales.
Abstract
To mitigate the memory constraints associated with fine-tuning large pre-trained models, existing parameter-efficient fine-tuning (PEFT) methods, such as LoRA, rely on low-rank updates. However, such updates fail to fully capture the rank characteristics of the weight modifications observed in full-parameter fine-tuning, resulting in a performance gap. Furthermore, LoRA and other existing PEFT methods still require substantial memory to store the full set of frozen weights, limiting their efficiency in resource-constrained settings. To address these limitations, we introduce Cumulative Energy-Retaining Subspace Adaptation (CERSA), a novel fine-tuning paradigm that leverages singular value decomposition (SVD) to retain only the principal components responsible for 90% to 95% of the spectral energy. By fine-tuning low-rank representations derived from this principal subspace, CERSA…
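The energy-based truncation mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes that "spectral energy" means the sum of squared singular values and that the retained rank is the smallest k whose cumulative energy reaches the threshold. The function names `energy_rank` and `truncate_by_energy` are hypothetical, and the fine-tuning step itself is not shown.

```python
import numpy as np


def energy_rank(singular_values, energy=0.90):
    """Smallest k such that the top-k squared singular values hold `energy` of the total.

    Assumption: spectral energy is measured as the sum of squared singular values.
    """
    sq = np.asarray(singular_values, dtype=np.float64) ** 2
    cumulative = np.cumsum(sq) / np.sum(sq)
    # Small tolerance guards against floating-point rounding at the threshold.
    k = int(np.searchsorted(cumulative, energy - 1e-12)) + 1
    return min(k, len(sq))


def truncate_by_energy(weight, energy=0.90):
    """Return the SVD factors (U_k, s_k, Vt_k) that retain `energy` of the spectrum."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    k = energy_rank(s, energy)
    return u[:, :k], s[:k], vt[:k, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical weight matrix; real pre-trained weights would be loaded instead.
    w = rng.standard_normal((256, 128))
    u_k, s_k, vt_k = truncate_by_energy(w, energy=0.95)
    full_params = w.size
    kept_params = u_k.size + s_k.size + vt_k.size
    print(f"rank kept: {len(s_k)}, parameters stored: {kept_params} of {full_params}")
```

Storing only the truncated factors saves memory only when the retained rank k satisfies k(m + n + 1) < mn for an m-by-n weight, so the benefit depends on how quickly the singular values of the actual weights decay.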
