Provable Continual Learning via Sketched Jacobian Approximations
Reinhard Heckel

TL;DR
This paper introduces a Jacobian sketching method for continual learning that provably prevents catastrophic forgetting in linear models and wide neural networks, offering insights into regularization effectiveness and memory trade-offs.
Contribution
It proposes a Jacobian sketching regularization technique that overcomes limitations of existing continual learning methods such as elastic weight consolidation (EWC).
Findings
Jacobian sketching prevents catastrophic forgetting in linear models.
The method extends to wide neural networks with provable guarantees.
The paper analyzes how effective the regularization is and what it costs in memory.
Abstract
An important problem in machine learning is the ability to learn tasks in a sequential manner. If trained with standard first-order methods, most models forget previously learned tasks when trained on a new task, which is often referred to as catastrophic forgetting. A popular approach to overcome forgetting is to regularize the loss function by penalizing models that perform poorly on previous tasks. For example, elastic weight consolidation (EWC) regularizes with a quadratic form involving a diagonal matrix built from past data. While EWC works very well for some setups, we show that, even under otherwise ideal conditions, it can provably suffer catastrophic forgetting if the diagonal matrix is a poor approximation of the Hessian matrix of previous tasks. We propose a simple approach to overcome this: Regularizing training of a new task with sketches of the Jacobian matrix of past…
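The idea in the abstract can be illustrated on a toy linear regression problem. The following is a minimal sketch under stated assumptions, not the paper's implementation: for a linear model the Jacobian of the task-1 predictions with respect to the weights is the data matrix X1, the sketch is assumed to be a Gaussian random projection over the sample dimension, and the task-1 data are assumed to be low rank so that a small sketch captures them. The names `train_new_task`, `loss`, all sizes, and the value of `lam` are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: d parameters, n1/n2 samples per task, sketch size k.
d, n1, n2, rank, k = 50, 20, 20, 5, 5

# Task-1 inputs are low rank on purpose (an assumption of this example).
X1 = rng.standard_normal((n1, rank)) @ rng.standard_normal((rank, d))
X2 = rng.standard_normal((n2, d))
y1 = X1 @ rng.standard_normal(d)  # task 1 targets
y2 = X2 @ rng.standard_normal(d)  # task 2 targets from a different model


def loss(X, y, w):
    """Sum of squared errors of the linear model w on (X, y)."""
    return float(np.sum((X @ w - y) ** 2))


def train_new_task(X, y, w_prev, J_reg, lam):
    """Minimum-norm update of w_prev that minimizes
    ||X w - y||^2 + lam * ||J_reg (w - w_prev)||^2."""
    residual = y - X @ w_prev
    A = np.vstack([X, np.sqrt(lam) * J_reg])
    b = np.concatenate([residual, np.zeros(J_reg.shape[0])])
    delta = np.linalg.lstsq(A, b, rcond=1e-10)[0]
    return w_prev + delta


# Fit task 1 with the minimum-norm interpolating solution.
w1 = np.linalg.lstsq(X1, y1, rcond=1e-10)[0]

# For f(x) = x @ w, the Jacobian of the task-1 predictions w.r.t. w is X1.
# Store only a k x d sketch of it instead of the full n1 x d matrix.
S = rng.standard_normal((k, n1)) / np.sqrt(k)
J_sketch = S @ X1

# Train on task 2 without a regularizer (lam = 0) and with the sketch penalty.
w_plain = train_new_task(X2, y2, w1, J_sketch, lam=0.0)
w_sketch = train_new_task(X2, y2, w1, J_sketch, lam=10.0)

print("task-1 loss after task 1:           ", loss(X1, y1, w1))
print("task-1 loss, no regularizer:        ", loss(X1, y1, w_plain))
print("task-1 loss, sketched Jacobian:     ", loss(X1, y1, w_sketch))
print("task-2 loss, sketched Jacobian:     ", loss(X2, y2, w_sketch))
```

In this toy setup the unregularized update fits task 2 but changes the task-1 predictions, while the penalized update fits task 2 and leaves them essentially unchanged, using k rows of memory rather than n1. This depends on the low-rank and overparameterization assumptions above; it does not reproduce the paper's analysis of EWC or of wide neural networks.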
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Elastic Weight Consolidation
