When Does Structure Matter in Continual Learning? Dimensionality Controls When Modularity Shapes Representational Geometry
Kathrin Korte, Joachim Winter Pedersen, Eleni Nisioti, and Sebastian Risi

TL;DR
This paper investigates how network architecture, task similarity, and representational dimensionality influence the effectiveness of structural separation in continual learning, emphasizing the role of dimensionality in shaping representational geometry.
Contribution
It demonstrates that representational dimensionality determines when architectural modularity benefits continual learning, revealing graded task-specific subspace alignment based on task similarity.
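One way to make "subspace alignment" concrete is to compare the principal subspaces of hidden activity recorded on two tasks. The sketch below is illustrative only: this summary does not state the alignment metric used in the paper, so the mean squared cosine of the principal angles, the helper names `top_pcs` and `subspace_alignment`, and the synthetic activity are all assumptions.

```python
import numpy as np

def top_pcs(activity, k):
    """Orthonormal basis (n_units, k) for the top-k principal subspace.

    activity: array of shape (n_samples, n_units). Hypothetical helper.
    """
    centered = activity - activity.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def subspace_alignment(basis_a, basis_b):
    """Mean squared cosine of the principal angles between two subspaces.

    Returns a value near 1 for identical subspaces and near 0 for
    orthogonal ones. This is an assumed metric, not the paper's.
    """
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(np.mean(np.clip(cosines, 0.0, 1.0) ** 2))

# Synthetic stand-in for hidden-state activity on three tasks.
rng = np.random.default_rng(0)
n_units, k = 20, 3
q, _ = np.linalg.qr(rng.normal(size=(n_units, n_units)))
shared, separate = q[:, :k], q[:, k:2 * k]  # two orthogonal subspaces

act_a = rng.normal(size=(500, k)) @ shared.T    # task A
act_b = rng.normal(size=(500, k)) @ shared.T    # task B reuses A's subspace
act_c = rng.normal(size=(500, k)) @ separate.T  # task C uses an orthogonal one

align_same = subspace_alignment(top_pcs(act_a, k), top_pcs(act_b, k))
align_orth = subspace_alignment(top_pcs(act_a, k), top_pcs(act_c, k))
```

Under this measure, a graded relationship would appear as alignment values falling between these two extremes as task similarity varies.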
Findings
In high-dimensional regimes, architecture has minimal impact on interference.
Lower-dimensional regimes benefit from modular architectures with task-specific subspace separation.
Representational dimensionality governs the relevance of structural separation in continual learning.
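To illustrate what a high- versus low-dimensional regime can mean in practice, the sketch below computes the participation ratio of a set of hidden activations. This is a common estimate of effective dimensionality, chosen here as an assumption: this summary does not say which dimensionality measure the paper uses. The function name and the synthetic data are also hypothetical.

```python
import numpy as np

def participation_ratio(activity):
    """Effective dimensionality: (sum of eigenvalues)^2 / sum of squared eigenvalues.

    activity: array of shape (n_samples, n_units).
    The eigenvalues are those of the unit-by-unit covariance matrix.
    """
    centered = activity - activity.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (activity.shape[0] - 1)
    # Clip tiny negative eigenvalues caused by floating-point error.
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    return float(eig.sum() ** 2 / (eig ** 2).sum())

rng = np.random.default_rng(0)
n_samples, n_units = 2000, 50

# Variance spread evenly over all units: dimensionality close to n_units.
high_dim = rng.normal(size=(n_samples, n_units))

# All variance along a single direction: dimensionality close to 1.
low_dim = rng.normal(size=(n_samples, 1)) @ rng.normal(size=(1, n_units))

pr_high = participation_ratio(high_dim)
pr_low = participation_ratio(low_dim)
```

The ratio ranges from 1, when a single direction carries all the variance, up to the number of units, when variance is spread evenly.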
Abstract
Continual learning systems must strike a balance between plasticity, the ability to acquire new knowledge, and stability, the ability to preserve previously learned representations. This stability-plasticity dilemma affects how representations can be reused across tasks: shared structure enables transfer when tasks are similar but may also induce interference when new learning disrupts existing representations. However, it remains unclear when and why structural separation influences this trade-off. In this study, we examine how network architecture, task similarity, and representational dimensionality jointly shape learning in a sequential task paradigm inspired by transfer-interference studies. We compare a task-partitioned modular recurrent network with a single-module baseline by systematically varying task similarity (low, medium, high) and the scale of weight initialization, which induces…
