TL;DR
This paper introduces Contrastive Supervised Distillation (CSD), a novel training method for continual visual representation learning that reduces feature forgetting and enhances discriminative features by leveraging label information in a contrastive distillation framework.
Contribution
The paper presents a new contrastive supervised distillation approach that effectively mitigates catastrophic forgetting in continual learning for visual search tasks.
Findings
CSD outperforms existing methods in reducing catastrophic forgetting.
Feature forgetting in visual retrieval is less severe than in classification tasks.
CSD improves discriminative feature learning in continual settings.
Abstract
In this paper, we propose a novel training procedure for the continual representation learning problem, in which a neural network model is trained sequentially, with the goal of alleviating catastrophic forgetting in visual search tasks. Our method, called Contrastive Supervised Distillation (CSD), reduces feature forgetting while learning discriminative features. This is achieved by leveraging label information in a distillation setting in which the student model is contrastively learned from the teacher model. Extensive experiments show that CSD performs favorably in mitigating catastrophic forgetting, outperforming current state-of-the-art methods. Our results also provide further evidence that feature forgetting evaluated in visual retrieval tasks is not as catastrophic as in classification tasks. Code at: https://github.com/NiccoBiondi/ContrastiveSupervisedDistillation.
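To make the idea of label-aware contrastive distillation concrete, here is a minimal sketch in plain Python. It is not the paper's exact loss, and it is not taken from the linked repository. It assumes a generic supervised-contrastive form: each student feature is pulled toward the teacher features of samples with the same label and pushed away from the teacher features of other classes. The function name, the `temperature` parameter, and the toy data are all hypothetical, chosen only for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrastive_supervised_distillation_loss(student, teacher, labels, temperature=0.1):
    """Illustrative label-aware contrastive distillation loss.

    ASSUMPTION: a generic supervised-contrastive formulation, not the
    authors' published definition. For each student feature, the positives
    are the teacher features that share its label; every teacher feature in
    the batch appears in the softmax denominator.
    """
    student = [l2_normalize(v) for v in student]
    teacher = [l2_normalize(v) for v in teacher]
    total = 0.0
    for i, s in enumerate(student):
        # Temperature-scaled cosine similarity to every teacher feature.
        logits = [dot(s, t) / temperature for t in teacher]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Sample i always matches its own label, so this list is never empty.
        positives = [j for j, y in enumerate(labels) if y == labels[i]]
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
    return total / len(student)


# Toy batch: two classes, two samples each (made-up values).
teacher_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
batch_labels = [0, 0, 1, 1]

# A student that matches the teacher versus one whose classes are swapped.
aligned_student = [list(v) for v in teacher_feats]
swapped_student = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]

loss_aligned = contrastive_supervised_distillation_loss(aligned_student, teacher_feats, batch_labels)
loss_swapped = contrastive_supervised_distillation_loss(swapped_student, teacher_feats, batch_labels)
print(loss_aligned < loss_swapped)
```

In this toy batch, the student that agrees with the teacher's class structure receives the lower loss. That is the behavior one would expect from a distillation term meant to limit feature forgetting while keeping classes separated. A practical version would operate on batched tensors in a deep learning framework; the list-based form here is chosen only to keep the sketch self-contained.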
