Forgetting of task-specific knowledge in model merging-based continual learning
Timm Hess, Gido M van de Ven, Tinne Tuytelaars

TL;DR
This study examines how linear model merging in continual learning affects shared and task-specific knowledge. Shared knowledge is largely preserved or enhanced, while unshared task-specific knowledge rapidly degrades, and merging models from incremental training outperforms merging models trained in parallel.
Contribution
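Using controlled visual cues in computer vision experiments, the paper separates the effect of linear merging on shared knowledge from its effect on unshared task-specific knowledge, and shows that merging models from an incremental training process consistently outperforms merging models trained in parallel.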
Findings
Shared knowledge is largely preserved or enhanced after merging.
Unshared task-specific knowledge rapidly degrades after merging.
Merging models from an incremental training process consistently outperforms merging models trained in parallel.
Abstract
This paper investigates the linear merging of models in the context of continual learning (CL). Using controlled visual cues in computer vision experiments, we demonstrate that merging largely preserves or enhances shared knowledge, while unshared task-specific knowledge rapidly degrades. We further find that merging models from an incremental training process consistently outperforms merging models trained in parallel.
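As a rough illustration of what "linear merging" means here, the sketch below averages the parameters of several models with scalar weights. It is not the authors' implementation: the flat dictionary format, the function name `linear_merge`, and the uniform default weights are assumptions made for this example only.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical parameter format for illustration: name -> flat list of floats.
Params = Dict[str, List[float]]


def linear_merge(
    models: Sequence[Params],
    weights: Optional[Sequence[float]] = None,
) -> Params:
    """Return the weighted average of the given models' parameters."""
    if not models:
        raise ValueError("need at least one model to merge")
    if weights is None:
        # Assumption: uniform averaging when no weights are given.
        weights = [1.0 / len(models)] * len(models)
    if len(weights) != len(models):
        raise ValueError("need exactly one weight per model")

    names = models[0].keys()
    for model in models:
        if model.keys() != names:
            raise ValueError("all models must share the same parameter names")

    merged: Params = {}
    for name in names:
        size = len(models[0][name])
        # Element-wise weighted sum across models.
        merged[name] = [
            sum(w * m[name][i] for w, m in zip(weights, models))
            for i in range(size)
        ]
    return merged


# Two toy "models" with identical architecture.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}
merged = linear_merge([model_a, model_b])
```

In this sketch, the choice between merging incrementally trained models and merging models trained in parallel would only change which checkpoints are passed in as `models`; the averaging step itself stays the same.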
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Higher Education Learning Practices · AI-based Problem Solving and Planning
