Subspace Geometry Governs Catastrophic Forgetting in Low-Rank Adaptation

Brady Steele

arXiv:2603.02224 · cs.LG · March 4, 2026

TL;DR

This paper develops a geometric theory of catastrophic forgetting in Low-Rank Adaptation (LoRA), linking forgetting to the minimum principal angle between task gradient subspaces and showing how orthogonality between those subspaces affects continual learning performance.

Contribution

It introduces a geometric law that predicts forgetting from subspace angles, reveals an approximate rank-invariance property, and clarifies when adapter rank affects forgetting in LoRA.

Findings

01

Forgetting correlates strongly with the minimum principal angle between task subspaces.

02

At high subspace angles, forgetting becomes approximately rank-invariant, that is, largely independent of adapter rank.

03

Orthogonal methods like O-LoRA perform well when natural orthogonality is high.
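
The quantity behind these findings, the minimum principal angle between two subspaces, can be computed with a standard linear-algebra recipe. The sketch below is illustrative only and is not taken from the paper: it assumes each task's gradient subspace is given as the column space of a full-column-rank matrix, and the function name `min_principal_angle` is hypothetical.

```python
import numpy as np

def min_principal_angle(A, B):
    """Smallest principal angle (radians) between the column spaces of A and B.

    Assumption: A and B have full column rank, so QR yields orthonormal bases.
    """
    # Orthonormal bases for each column space.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # The largest cosine gives the smallest angle; clip guards against round-off.
    return float(np.arccos(np.clip(cosines.max(), 0.0, 1.0)))

E = np.eye(4)
# Two planes in R^4 that share no direction: the angle is pi/2.
theta_orth = min_principal_angle(E[:, :2], E[:, 2:])
# Two planes that share the first coordinate axis: the angle is 0.
theta_shared = min_principal_angle(E[:, [0, 1]], E[:, [0, 2]])
```

A single shared direction is enough to drive the minimum angle to zero, which is why this is a worst-case measure of overlap rather than an average one.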

Abstract

Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient approach for adapting large pre-trained models, yet its behavior under continual learning remains poorly understood. We present a geometric theory characterizing catastrophic forgetting in LoRA through the lens of gradient subspace interactions. Our central finding is that forgetting is governed by a simple geometric law: $F = \alpha (1 - \cos^{2} \theta_{\min}) + \beta$, where $\theta_{\min}$ is the minimum principal angle between task gradient subspaces. This formulation reveals an approximate rank-invariance property: at high subspace angles, forgetting becomes largely independent of the adapter rank (coefficient of variation $\approx 0.8\%$ in controlled synthetic settings; CV $\approx 10$ to $19\%$ on real benchmarks, suggesting this is regime-dependent rather than absolute). We validate our theory on synthetic…
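
As a rough illustration of the law quoted in the abstract, the sketch below evaluates $F$ at the two extreme angles and computes a coefficient of variation across adapter ranks. The values of `alpha` and `beta` and the per-rank forgetting numbers are made up for illustration; the abstract excerpt does not give fitted coefficients, and the use of the population standard deviation in the CV is an assumption.

```python
import numpy as np

def predicted_forgetting(theta_min, alpha, beta):
    """Evaluate F = alpha * (1 - cos^2(theta_min)) + beta as written in the abstract."""
    return alpha * (1.0 - np.cos(theta_min) ** 2) + beta

def coefficient_of_variation(values):
    """Standard deviation divided by mean (population std; an assumption)."""
    values = np.asarray(values, dtype=float)
    return float(values.std() / values.mean())

# Hypothetical coefficients, NOT values from the paper.
alpha, beta = 1.0, 0.1

f_aligned = predicted_forgetting(0.0, alpha, beta)           # reduces to beta
f_orthogonal = predicted_forgetting(np.pi / 2, alpha, beta)  # reduces to alpha + beta

# Made-up forgetting measurements at several adapter ranks, for illustration only.
forgetting_by_rank = {4: 0.50, 8: 0.51, 16: 0.49, 32: 0.50}
cv = coefficient_of_variation(list(forgetting_by_rank.values()))
```

Since $1 - \cos^{2}\theta = \sin^{2}\theta$, the formula as quoted interpolates between $\beta$ at $\theta_{\min} = 0$ and $\alpha + \beta$ at $\theta_{\min} = \pi/2$. A small CV across ranks is the kind of evidence the abstract cites for approximate rank invariance.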

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis