How do Quadratic Regularizers Prevent Catastrophic Forgetting: The Role of Interpolation

Ekdeep Singh Lubana; Puja Trivedi; Danai Koutra; Robert P. Dick

arXiv:2102.02805·cs.LG·August 16, 2022·1 cites



TL;DR

This paper explains how quadratic regularizers prevent catastrophic forgetting in neural networks by interpolating model parameters, and proposes modifications to improve their stability and effectiveness, validated through extensive experiments.

Contribution

It provides a detailed explanation of quadratic regularizers' role in preventing forgetting and introduces a simple modification to enhance their performance and stability.

Findings

01. Quadratic regularizers interpolate parameters to prevent forgetting.

02. The modification improves accuracy by 6.2% and reduces forgetting by 4.5%.

03. Results are validated across 2000 models in various settings.

Abstract

Catastrophic forgetting undermines the effectiveness of deep neural networks (DNNs) in scenarios such as continual learning and lifelong learning. While several methods have been proposed to tackle this problem, there is limited work explaining why these methods work well. This paper aims to better explain a widely used technique for avoiding catastrophic forgetting: quadratic regularization. We show that quadratic regularizers prevent forgetting of past tasks by interpolating current and previous values of model parameters at every training iteration. Over multiple training iterations, this interpolation operation reduces the learning rates of more important model parameters, thereby minimizing their movement. Our analysis also reveals two drawbacks of quadratic regularization: (a) dependence of parameter interpolation on training hyperparameters, which often leads to…
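The interpolation claim in the abstract can be illustrated with a minimal sketch. It assumes the penalty has the common per-parameter form (lam / 2) * a_i * (theta_i - theta_prev_i)^2, which this page does not spell out, and it isolates the penalty term from the task-loss gradient. The function names, the importance values, and the hyperparameters are hypothetical and are not taken from the authors' code.

```python
def penalty_step(theta, theta_prev, importance, lr, lam):
    """One gradient-descent step on the quadratic penalty alone (assumed form)."""
    # d/dt of (lam / 2) * a * (t - p)**2 is lam * a * (t - p).
    return [t - lr * lam * a * (t - p)
            for t, p, a in zip(theta, theta_prev, importance)]


def interpolate(theta, theta_prev, importance, lr, lam):
    """The same update written as a weighted mix of current and previous values."""
    mixed = []
    for t, p, a in zip(theta, theta_prev, importance):
        w = lr * lam * a  # interpolation weight given to the previous value
        mixed.append((1.0 - w) * t + w * p)
    return mixed


# Hypothetical values chosen only for illustration.
theta = [1.0, -2.0, 0.5]        # current parameters
theta_prev = [0.0, 0.0, 0.0]    # parameters after the previous task
importance = [0.1, 1.0, 10.0]   # per-parameter importance a_i
lr, lam = 0.05, 1.0

stepped = penalty_step(theta, theta_prev, importance, lr, lam)
mixed = interpolate(theta, theta_prev, importance, lr, lam)
```

Under these assumptions the two functions return the same values, so the penalty's gradient step is an interpolation between current and previous parameters, and a larger importance pulls a parameter more strongly toward its previous value. The weight is the product lr * lam * a_i, so it changes with the training hyperparameters, and it stops being a convex combination once that product exceeds 1.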


Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications