Reducing Catastrophic Forgetting in Online Class Incremental Learning Using Self-Distillation
Kotaro Nagata, Hiromu Ono, Kazuhiro Hotta

TL;DR
This paper proposes a self-distillation approach to mitigate catastrophic forgetting in online class incremental learning, enhancing knowledge transfer and memory management for better performance on standard datasets.
Contribution
It introduces a novel self-distillation technique using a highly generalizable shallow layer output as a teacher, and a memory update strategy prioritizing misclassified samples.
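As a rough illustration of the self-distillation idea, the sketch below treats the shallow layer's class scores as the teacher and the final layer's scores as the student, and penalizes the divergence between their softened distributions. The KL-divergence form, the temperature value, and the names `softmax` and `self_distillation_loss` are assumptions made for this example; the summary does not specify the paper's actual loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def self_distillation_loss(shallow_logits, final_logits, temperature=2.0):
    """KL(teacher || student), with the shallow-layer output as teacher.

    Assumption: KL divergence and temperature=2.0 are illustrative choices,
    not values taken from the paper.
    """
    teacher = softmax(shallow_logits, temperature)
    student = softmax(final_logits, temperature)
    eps = 1e-12  # guards against division by zero on underflow
    return sum(
        t * math.log(t / max(s, eps))
        for t, s in zip(teacher, student)
        if t > 0
    )
```

In a training loop, a term like this would typically be added to the usual classification loss, so that the final layer is pulled toward the shallow layer's predictions.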
Findings
Outperforms conventional methods on CIFAR10, CIFAR100, and MiniImageNet
Improves knowledge transfer and memory efficiency
Reduces catastrophic forgetting in incremental learning
Abstract
In continual learning, catastrophic forgetting is a serious problem: a model forgets previous knowledge when it learns new tasks. Various methods have been proposed to solve this problem. Replay methods, which replay data from previous tasks during later training, have shown good accuracy. However, because their memory buffer is limited, replay methods suffer from poor generalizability. In this paper, we try to solve this problem by acquiring transferable knowledge through self-distillation, using the highly generalizable output of a shallow layer as the teacher. Furthermore, when dealing with a large number of classes or challenging data, there is a risk that training does not converge and the model does not even reach the point of overfitting. Therefore, we attempt to achieve more efficient and thorough learning by prioritizing the storage of easily misclassified samples through a new memory update method. We…
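One way to picture a memory update that favors easily misclassified samples is a fixed-size buffer that evicts the sample the model currently handles most confidently. The sketch below is a minimal illustration under stated assumptions: the difficulty score (one minus the predicted probability of the true class), the eviction rule, and the class name `MisclassificationPriorityBuffer` are all hypothetical and are not taken from the paper.

```python
import heapq
import itertools


class MisclassificationPriorityBuffer:
    """Fixed-capacity replay buffer that keeps the hardest samples.

    Assumption: difficulty = 1 - p(true class) is an illustrative score.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []  # min-heap of (difficulty, tiebreak, sample, label)
        self._counter = itertools.count()  # tiebreak so samples are never compared

    def update(self, sample, label, true_class_prob):
        """Offer a sample; it is stored if the buffer has room or if it is
        harder than the easiest sample currently stored."""
        difficulty = 1.0 - true_class_prob
        entry = (difficulty, next(self._counter), sample, label)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif difficulty > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)  # evict the easiest sample

    def samples(self):
        """Return the stored (sample, label) pairs in arbitrary order."""
        return [(s, y) for _, _, s, y in self._heap]
```

For example, with capacity 2, offering samples with true-class probabilities 0.9, 0.2, and 0.5 leaves the two harder samples (0.2 and 0.5) in the buffer.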
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
