Elastic Weight Consolidation Done Right for Continual Learning
Xuan Liu, Xiaobin Chang

TL;DR
This paper identifies fundamental issues in the importance estimation of Elastic Weight Consolidation (EWC) for continual learning and proposes a simple modification, Logits Reversal, that significantly improves its performance across various tasks.
Contribution
The paper systematically analyzes EWC's importance estimation flaws and introduces Logits Reversal to correct these issues, leading to superior continual learning results.
Findings
EWC's reliance on the Fisher Information Matrix can cause gradient vanishing and inaccurate importance estimation in certain scenarios.
Memory Aware Synapses imposes redundant constraints on parameters irrelevant to prior tasks.
Logits Reversal effectively improves EWC's importance estimation and performance.
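The gradient-vanishing finding can be illustrated with a toy example. The sketch below is not the paper's analysis. It only shows one well-known way log-likelihood gradients shrink: for a softmax classifier, the gradient of log p(y|x) with respect to the logits is onehot(y) - p, which approaches zero as the model becomes confident in the correct class. The function names and the example logits are assumptions chosen for illustration.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def logit_grad_sq(z, y):
    """Squared gradient of log p(y|x) with respect to the logits z."""
    p = softmax(z)
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    # d/dz log softmax(z)[y] = onehot(y) - p
    return (onehot - p) ** 2

# Hypothetical logits: same true class (index 0), different confidence.
uncertain = np.array([1.0, 0.5, 0.0])
confident = np.array([10.0, 0.5, 0.0])

g_uncertain = logit_grad_sq(uncertain, 0).sum()
g_confident = logit_grad_sq(confident, 0).sum()
print(g_uncertain, g_confident)
```

Because a Fisher-style importance estimate averages squared gradients like these, a sample the model already classifies confidently contributes almost nothing to the estimate in this toy setting.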
Abstract
Weight regularization methods in continual learning (CL) alleviate catastrophic forgetting by assessing and penalizing changes to important model weights. Elastic Weight Consolidation (EWC) is a foundational and widely used approach within this framework that estimates weight importance based on gradients. However, it has consistently shown suboptimal performance. In this paper, we conduct a systematic analysis of importance estimation in EWC from a gradient-based perspective. For the first time, we find that EWC's reliance on the Fisher Information Matrix (FIM) results in gradient vanishing and inaccurate importance estimation in certain scenarios. Our analysis also reveals that Memory Aware Synapses (MAS), a variant of EWC, imposes unnecessary constraints on parameters irrelevant to prior tasks, termed the redundant protection. Consequently, both EWC and its variants exhibit…
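As background for the abstract, here is a minimal sketch of standard EWC importance estimation and its penalty, not the modification proposed in the paper. It assumes a linear softmax classifier, a diagonal empirical Fisher computed from the given labels, and the usual quadratic penalty (lambda / 2) * sum_i F_i * (theta_i - theta*_i)^2. The names `diag_fisher`, `ewc_penalty`, and `lam`, and the random data, are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def diag_fisher(W, X, Y):
    """Diagonal empirical Fisher for a linear classifier with logits = W @ x."""
    F = np.zeros_like(W)
    for x, y in zip(X, Y):
        p = softmax(W @ x)
        onehot = np.zeros_like(p)
        onehot[y] = 1.0
        # Gradient of log p(y|x) with respect to W is (onehot - p) x^T.
        g = np.outer(onehot - p, x)
        F += g ** 2
    return F / len(X)

def ewc_penalty(W, W_star, F, lam=1.0):
    """Quadratic penalty on drift from old-task weights, scaled by importance F."""
    return 0.5 * lam * np.sum(F * (W - W_star) ** 2)

# Hypothetical old-task data and weights (assumed already trained).
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
Y = rng.integers(0, 3, size=32)
W_star = rng.normal(size=(3, 4))

F = diag_fisher(W_star, X, Y)
penalty_at_optimum = ewc_penalty(W_star, W_star, F)
penalty_shifted = ewc_penalty(W_star + 0.1, W_star, F)
```

When training on a new task, this penalty would be added to the new-task loss, so weights with large `F` entries are held close to `W_star` while weights with small entries remain free to change.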
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Face recognition and analysis · Advanced Neural Network Applications
