Rectification-based Knowledge Retention for Continual Learning

Pravendra Singh; Pratik Mazumder; Piyush Rai; Vinay P. Namboodiri

arXiv:2103.16597·cs.CV·April 1, 2021

TL;DR

This paper introduces a continual learning method that adapts a model to sequentially arriving tasks using weight rectifications and affine transformations, reducing catastrophic forgetting in both zero-shot and non-zero-shot task incremental settings.

Contribution

The proposed approach employs weight rectifications and affine transformations with few parameters, achieving state-of-the-art results in task incremental learning and generalized zero-shot learning.

Findings

01

Outperforms the state-of-the-art non-zero-shot task incremental learning method by over 5% on CIFAR-100.

02

Outperforms the state-of-the-art task incremental generalized zero-shot learning method by absolute margins of 6.91% on AWA1 and 6.33% on CUB.

03

Validated through extensive ablation studies.

Abstract

Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting. In this work, we propose a novel approach to address the task incremental learning problem, which involves training a model on new tasks that arrive in an incremental manner. The task incremental learning problem becomes even more challenging when the test set contains classes that are not part of the train set, i.e., a task incremental generalized zero-shot learning problem. Our approach can be used in both the zero-shot and non zero-shot task incremental learning settings. Our proposed method uses weight rectifications and affine transformations in order to adapt the model to different tasks that arrive sequentially. Specifically, we adapt the network weights to work for new tasks by "rectifying" the weights learned from the previous task. We learn these weight rectifications…
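
The idea in the abstract, keeping shared weights and learning a small per-task "rectification" plus an affine transformation, can be illustrated with a toy linear layer. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract excerpt does not give the exact form, so the additive low-rank rectification, the per-output scale and shift, and every name here (`TaskAdapter`, `rectified_forward`, `left`, `right`) are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Plain-Python matrix-vector product."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


@dataclass
class TaskAdapter:
    """Hypothetical per-task parameters; far fewer values than the base weights."""
    left: Matrix   # (out x rank) factor of the assumed low-rank rectification
    right: Matrix  # (rank x in) factor of the assumed low-rank rectification
    scale: Vector  # per-output scale of the assumed affine transformation
    shift: Vector  # per-output shift of the assumed affine transformation


def rectified_forward(base_w: Matrix, adapter: TaskAdapter, x: Vector) -> Vector:
    # Assumption: the rectification is added to the shared weights.
    rect = matmul(adapter.left, adapter.right)
    task_w = [
        [b + r for b, r in zip(b_row, r_row)]
        for b_row, r_row in zip(base_w, rect)
    ]
    y = matvec(task_w, x)
    # Assumption: the affine transformation acts on the layer output.
    return [s * yi + t for s, yi, t in zip(adapter.scale, y, adapter.shift)]


# Shared weights are never modified; each task only stores its own adapter.
base_w = [[1.0, 0.0], [0.0, 1.0]]
task1 = TaskAdapter(left=[[0.0], [0.0]], right=[[0.0, 0.0]],
                    scale=[1.0, 1.0], shift=[0.0, 0.0])
task2 = TaskAdapter(left=[[1.0], [0.0]], right=[[0.0, 2.0]],
                    scale=[2.0, 1.0], shift=[0.5, 0.0])

x = [1.0, 3.0]
out1 = rectified_forward(base_w, task1, x)
out2 = rectified_forward(base_w, task2, x)
print(out1, out2)
```

Because the base weights stay untouched and each task keeps its own small adapter, selecting the adapter for an earlier task reproduces that task's outputs exactly, which is the sense in which this kind of design avoids overwriting earlier knowledge.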
