Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning

Ali Cheraghian; Shafin Rahman; Pengfei Fang; Soumava Kumar Roy; Lars Petersson; Mehrtash Harandi

arXiv:2103.04059·cs.CV·April 1, 2021

TL;DR

This paper introduces a semantic-aware knowledge distillation method for few-shot class-incremental learning, leveraging semantic information and attention mechanisms to improve learning efficiency and reduce forgetting.

Contribution

It proposes a novel distillation algorithm that uses semantic information and attention mechanisms to enhance FSCIL performance, achieving state-of-the-art results.

Findings

01 Outperforms existing methods on the MiniImageNet, CUB200, and CIFAR100 datasets.

02 Effectively reduces catastrophic forgetting in FSCIL.

03 Utilizes semantic word embeddings to facilitate learning with limited data.

Abstract

Few-shot class-incremental learning (FSCIL) portrays the problem of learning new concepts gradually, where only a few examples per concept are available to the learner. Due to the limited number of examples for training, the techniques developed for standard incremental learning cannot be applied verbatim to FSCIL. In this work, we introduce a distillation algorithm to address the problem of FSCIL and propose to make use of semantic information during training. To this end, we use word embeddings as semantic information, which are cheap to obtain and facilitate the distillation process. Furthermore, we propose a method based on an attention mechanism on multiple parallel embeddings of visual data to align visual and semantic vectors, which reduces issues related to catastrophic forgetting. Via experiments on the MiniImageNet, CUB200, and CIFAR100 datasets, we establish new…
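The idea of attending over multiple parallel embeddings to align visual and semantic vectors can be illustrated with a small sketch. This is not the authors' implementation: the function names (`attend_parallel_embeddings`, `predict`), the linear projections, the dot-product attention scores, and the cosine-similarity classifier are all assumptions chosen only to make the idea concrete. The sketch maps one visual feature into a semantic space through several projections, fuses the results with softmax attention weights, and picks the class whose word embedding is closest.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def attend_parallel_embeddings(feature, projections, attn_vectors):
    """Fuse several parallel embeddings of one visual feature.

    Hypothetical design: each projection maps the feature into the
    semantic space, and each attention vector yields one scalar score.
    """
    candidates = [matvec(proj, feature) for proj in projections]
    scores = [sum(a * f for a, f in zip(attn, feature)) for attn in attn_vectors]
    weights = softmax(scores)
    dim = len(candidates[0])
    fused = [sum(w * c[i] for w, c in zip(weights, candidates)) for i in range(dim)]
    return fused, weights


def predict(fused, class_embeddings):
    """Return the class whose word embedding is most similar to `fused`."""
    return max(class_embeddings, key=lambda name: cosine(fused, class_embeddings[name]))


# Toy data, made up for illustration only.
feature = [1.0, 0.0]
projections = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
attn_vectors = [[10.0, 0.0], [0.0, 0.0]]
class_embeddings = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}

fused, weights = attend_parallel_embeddings(feature, projections, attn_vectors)
label = predict(fused, class_embeddings)
```

Because classes are represented by word embeddings rather than by rows of a growing classifier layer, a new class in this sketch only requires adding an entry to `class_embeddings`. In a trained system the projections and attention vectors would be learned, and a distillation term would discourage them from drifting as new classes arrive.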
