Catastrophic Forgetting Mitigation Through Plateau Phase Activity Profiling
Idan Mashiach, Oren Glickman, and Tom Tirer

TL;DR
This paper introduces a novel method for mitigating catastrophic forgetting in deep neural networks by profiling parameter activity during the training plateau, leading to improved retention of previous knowledge while learning new tasks.
Contribution
The paper proposes tracking parameter activity during the final training plateau, rather than throughout the entire training process, to identify flat loss-landscape directions that can adapt to new tasks while preserving previous knowledge.
Findings
Reported to outperform existing regularization methods at balancing retention of previous tasks against learning new ones
Parameters with higher activity (movement and variability) during the final plateau indicate relatively flat directions in the loss landscape
Reported to achieve superior results on continual learning benchmarks
Abstract
Catastrophic forgetting in deep neural networks occurs when learning new tasks degrades performance on previously learned tasks due to knowledge overwriting. Among the approaches to mitigate this issue, regularization techniques aim to identify and constrain "important" parameters to preserve previous knowledge. In the highly nonconvex optimization landscape of deep learning, we propose a novel perspective: tracking parameters during the final training plateau is more effective than monitoring them throughout the entire training process. We argue that parameters that exhibit higher activity (movement and variability) during this plateau reveal directions in the loss landscape that are relatively flat, making them suitable for adaptation to new tasks while preserving knowledge from previous ones. Our comprehensive experiments demonstrate that this approach achieves superior performance…
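The core idea in the abstract can be illustrated with a small toy sketch. This is not the authors' algorithm: the quadratic loss, the noisy gradient steps, the use of per-parameter standard deviation as the "activity" measure, and the inverse-activity penalty are all assumptions chosen for illustration, and every function name below is hypothetical. The sketch only shows why a parameter sitting in a flat direction tends to move more during the plateau than one in a sharp direction, and how that could translate into a weaker constraint on it.

```python
import numpy as np

def run_noisy_sgd(curvature, steps, lr, noise_std, rng):
    """Noisy gradient descent on the toy loss L(theta) = 0.5 * sum(curvature * theta**2)."""
    theta = np.ones_like(curvature)
    trajectory = np.empty((steps, curvature.size))
    for t in range(steps):
        # Gradient of the quadratic plus Gaussian noise standing in for minibatch noise.
        grad = curvature * theta + noise_std * rng.standard_normal(curvature.size)
        theta = theta - lr * grad
        trajectory[t] = theta
    return trajectory

def plateau_activity(trajectory, plateau_steps):
    """Assumed activity measure: per-parameter std over the last `plateau_steps` iterates."""
    window = trajectory[-plateau_steps:]
    return window.std(axis=0)

def penalty_strength(activity, eps=1e-8):
    """Assumed rule: constrain low-activity parameters more; normalized so the max is 1."""
    inv = 1.0 / (activity + eps)
    return inv / inv.max()

def quadratic_penalty(theta, theta_prev, strength):
    """Weighted squared distance from the previous-task solution."""
    return float(np.sum(strength * (theta - theta_prev) ** 2))

rng = np.random.default_rng(0)
curvature = np.array([10.0, 0.1])  # parameter 0: sharp direction, parameter 1: flat direction
trajectory = run_noisy_sgd(curvature, steps=3000, lr=0.05, noise_std=1.0, rng=rng)

# Profile only the final plateau, not the whole run.
activity = plateau_activity(trajectory, plateau_steps=1000)
strength = penalty_strength(activity)

print("plateau activity:", activity)
print("penalty strength:", strength)
```

In this toy setting the flat-direction parameter shows the larger plateau activity and therefore receives the smaller penalty, so a new task can move it at lower cost. A real network would need the activity statistics collected from the optimizer's iterates once the training loss has stopped improving, and how the plateau is detected is left open here.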
