Subspace Geometry Governs Catastrophic Forgetting in Low-Rank Adaptation
Brady Steele

TL;DR
This paper develops a geometric theory explaining catastrophic forgetting in Low-Rank Adaptation (LoRA) models, linking forgetting to the angle between task gradient subspaces and showing how orthogonality influences continual learning performance.
Contribution
It introduces a geometric law governing forgetting based on subspace angles, revealing rank-invariance properties and clarifying when rank impacts forgetting in LoRA models.
Findings
Forgetting correlates strongly with the minimum principal angle between task subspaces.
At high subspace angles, forgetting becomes approximately rank-invariant, that is, largely independent of adapter rank.
Orthogonal methods like O-LoRA perform well when natural orthogonality is high.
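
The minimum principal angle named in these findings can be computed with a standard linear-algebra recipe. The sketch below is not the paper's code: it assumes each task's gradient subspace is already given as a matrix whose columns span it (with full column rank), and the function name `min_principal_angle` is hypothetical. It orthonormalizes both bases with QR and reads the cosines of the principal angles off the singular values of the product of the two bases.

```python
import numpy as np

def min_principal_angle(A, B):
    """Smallest principal angle (radians) between the column spaces of A and B.

    Assumes A and B have shape (d, k_a) and (d, k_b) with full column rank.
    """
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against round-off
    # The largest cosine corresponds to the smallest angle.
    return float(np.arccos(cosines.max()))

# Toy subspaces in R^4 built from coordinate axes (illustrative only).
e = np.eye(4)
theta_orth = min_principal_angle(e[:, :2], e[:, 2:])     # disjoint planes: pi/2
theta_shared = min_principal_angle(e[:, :2], e[:, 1:3])  # planes share one axis: 0
```

Two subspaces that share any direction have a minimum principal angle of zero, while fully orthogonal subspaces give pi/2, which is the "high subspace angle" regime the findings refer to.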
Abstract
Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient approach for adapting large pre-trained models, yet its behavior under continual learning remains poorly understood. We present a geometric theory characterizing catastrophic forgetting in LoRA through the lens of gradient subspace interactions. Our central finding is that forgetting is governed by a simple geometric law: , where is the minimum principal angle between task gradient subspaces. This formulation reveals an approximate rank-invariance property: at high subspace angles, forgetting becomes largely independent of the adapter rank (coefficient of variation in controlled synthetic settings; CV - on real benchmarks, suggesting this is regime-dependent rather than absolute). We validate our theory on synthetic…
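
The coefficient of variation (CV) used above to quantify rank-invariance is the standard deviation of a set of measurements divided by their mean. A minimal sketch follows, assuming the population standard deviation (the source does not say which variant is used); the helper name `coefficient_of_variation` and the input values are hypothetical placeholders, not results from the paper.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean.

    Assumes a non-empty sequence with a non-zero mean.
    """
    mean = statistics.fmean(values)
    return statistics.pstdev(values) / mean

# Placeholder forgetting scores, one per adapter rank (made-up numbers,
# used only to show the call; they are NOT taken from the paper).
forgetting_by_rank = [0.20, 0.21, 0.19, 0.20]
cv = coefficient_of_variation(forgetting_by_rank)
```

A CV close to zero means forgetting barely changes as the adapter rank varies, which is what approximate rank-invariance would look like in practice.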
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis
