Do Not Forget to Attend to Uncertainty while Mitigating Catastrophic Forgetting

Vinod K Kurmi; Badri N. Patro; Venkatesh K. Subramanian; Vinay P. Namboodiri

arXiv:2102.01906·cs.LG·February 4, 2021

Open Access

TL;DR

This paper introduces a Bayesian and self-attention based approach to mitigate catastrophic forgetting in incremental learning by leveraging uncertainty estimation, leading to improved accuracy on benchmarks.

Contribution

It proposes a novel method combining Bayesian uncertainty and self-attention for incremental learning, addressing limitations of existing knowledge distillation techniques.

Findings

01

Improved accuracy on standard benchmarks.

02

Effective use of uncertainty in distillation losses.

03

Ablation studies validating the approach.

Abstract

One of the major limitations of deep learning models is that they face catastrophic forgetting in an incremental learning scenario. Several approaches have been proposed to tackle the problem of incremental learning. Most of these methods are based on knowledge distillation and do not adequately utilize the information provided by older task models, such as uncertainty estimation in predictions. Predictive uncertainty provides distributional information that can be applied to mitigate catastrophic forgetting in a deep learning framework. In the proposed work, we consider a Bayesian formulation to obtain the data and model uncertainties. We also incorporate a self-attention framework to address the incremental learning problem. We define distillation losses in terms of aleatoric uncertainty and self-attention. In the proposed work, we investigate different ablation analyses on…
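
The abstract distinguishes data (aleatoric) uncertainty from model (epistemic) uncertainty. As a rough illustration only, the sketch below uses the common entropy-based decomposition over several stochastic forward passes (for example, with dropout left on). This is not taken from the paper: the function names `entropy` and `decompose_uncertainty`, the pure-Python list representation, and the choice of this particular decomposition are all assumptions made for the example, and the paper's own losses may be defined differently.

```python
import math
from typing import List, Tuple


def entropy(p: List[float]) -> float:
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(samples: List[List[float]]) -> Tuple[float, float, float]:
    """Split predictive uncertainty for a single input.

    `samples` holds T softmax vectors, one per stochastic forward pass
    (hypothetical input format, assumed for this sketch).
    Returns (total, aleatoric, epistemic):
      total     = entropy of the averaged prediction
      aleatoric = average entropy of the individual predictions
      epistemic = total - aleatoric (mutual information)
    """
    t = len(samples)
    n_classes = len(samples[0])
    mean = [sum(s[c] for s in samples) / t for c in range(n_classes)]
    total = entropy(mean)
    aleatoric = sum(entropy(s) for s in samples) / t
    return total, aleatoric, total - aleatoric


# Passes agree that the input is ambiguous: uncertainty is all aleatoric.
agree = [[0.5, 0.5], [0.5, 0.5]]
# Passes are individually confident but contradict each other: mostly epistemic.
disagree = [[0.99, 0.01], [0.01, 0.99]]

for name, passes in (("agree", agree), ("disagree", disagree)):
    total, alea, epi = decompose_uncertainty(passes)
    print(f"{name}: total={total:.3f} aleatoric={alea:.3f} epistemic={epi:.3f}")
```

Both toy inputs have the same averaged prediction, so their total uncertainty is identical; only the split differs. A distillation loss that uses uncertainty could, under this assumed setup, compare such per-input quantities between the old-task model and the new model.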

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications

Methods: Knowledge Distillation