Fine-Grained Gradient Restriction: A Simple Approach for Mitigating Catastrophic Forgetting
Bo Liu, Mao Ye, Peter Stone, Qiang Liu

TL;DR
This paper introduces simple gradient restriction methods for continual learning that achieve a better trade-off between retaining old knowledge and learning new tasks than GEM with memory strength.
Contribution
The paper analyzes memory strength, an often overlooked hyper-parameter in GEM, shows that it helps mainly by improving GEM's generalization ability, and proposes two methods that constrain the update direction more flexibly.
Findings
Memory strength enhances GEM's generalization ability.
Proposed methods outperform GEM in balancing old and new knowledge.
Efficient optimization approach for constrained updates.
Abstract
A fundamental challenge in continual learning is to balance the trade-off between learning new tasks and remembering previously acquired knowledge. Gradient Episodic Memory (GEM) achieves this balance by using a subset of past training samples to restrict the update direction of the model parameters. In this work, we start by analyzing an often overlooked hyper-parameter in GEM, the memory strength, which boosts empirical performance by further constraining the update direction. We show that memory strength is effective mainly because it improves GEM's generalization ability and therefore leads to a more favorable trade-off. Building on this finding, we propose two approaches that constrain the update direction more flexibly. Our methods achieve uniformly better Pareto frontiers of remembering old and learning new knowledge than using memory strength. We further propose a…
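To make the role of memory strength concrete, here is a minimal sketch of a GEM-style update restriction for the special case of a single memory constraint. It is an illustration under stated assumptions, not the authors' implementation: GEM itself solves a quadratic program with one constraint per past task, and the two methods proposed in this paper are not shown. The function names gem_project_single and dot are hypothetical, and gradients are plain Python lists.

```python
def dot(u, v):
    """Inner product of two equal-length vectors (hypothetical helper)."""
    return sum(a * b for a, b in zip(u, v))


def gem_project_single(g, g_mem, memory_strength=0.0, eps=1e-12):
    """Restrict the gradient g using one memory gradient g_mem.

    Assumption: a single constraint, for which the GEM dual problem
        min_v  0.5 * ||g_mem||^2 * v^2 + <g, g_mem> * v   s.t.  v >= memory_strength
    has the closed-form solution used below. The restricted gradient is
    g + v * g_mem.
    """
    sq_norm = dot(g_mem, g_mem)
    if sq_norm < eps:
        # No usable memory gradient: leave g unchanged.
        return list(g)
    # Unconstrained minimizer is -<g, g_mem> / ||g_mem||^2; clip it from
    # below at the memory strength.
    v = max(memory_strength, -dot(g, g_mem) / sq_norm)
    return [gi + v * mi for gi, mi in zip(g, g_mem)]


# Conflicting gradients: the component that would hurt the memory loss is removed.
print(gem_project_single([1.0, -1.0], [0.0, 1.0]))
# Non-conflicting gradients: unchanged with zero memory strength ...
print(gem_project_single([1.0, 1.0], [0.0, 1.0]))
# ... but pulled further toward the memory gradient when memory strength > 0.
print(gem_project_single([1.0, 1.0], [0.0, 1.0], memory_strength=0.5))
```

With zero memory strength the gradient is modified only when it conflicts with the memory gradient. A positive memory strength always adds a multiple of the memory gradient, which is the sense in which it further constrains the update direction.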
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Cell Image Analysis Techniques
