Rethinking Momentum Knowledge Distillation in Online Continual Learning

Nicolas Michel; Maorong Wang; Ling Xiao; Toshihiko Yamasaki

arXiv:2309.02870 · cs.LG · June 6, 2024 · 2 citations

TL;DR

This paper explores the application of Momentum Knowledge Distillation (MKD) in Online Continual Learning (OCL), demonstrating significant accuracy improvements and providing insights into MKD's mechanics within OCL training.

Contribution

The paper introduces a direct methodology for applying MKD to existing OCL methods, improves their accuracy, and analyzes how MKD behaves internally during OCL training.

Findings

01. Improves state-of-the-art accuracy on ImageNet100 by over 10 percentage points.

02. Provides empirical analysis of MKD's impact during OCL training.

03. Demonstrates MKD as a central component in OCL methods.

Abstract

Online Continual Learning (OCL) addresses the problem of training neural networks on a continuous data stream where multiple classification tasks emerge in sequence. In contrast to offline Continual Learning, data can be seen only once in OCL, which is a very severe constraint. In this context, replay-based strategies have achieved impressive results and most state-of-the-art approaches heavily depend on them. While Knowledge Distillation (KD) has been extensively used in offline Continual Learning, it remains under-exploited in OCL, despite its high potential. In this paper, we analyze the challenges in applying KD to OCL and give empirical justifications. We introduce a direct yet effective methodology for applying Momentum Knowledge Distillation (MKD) to many flagship OCL methods and demonstrate its capabilities to enhance existing approaches. In addition to improving existing…
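The two ingredients that "momentum knowledge distillation" usually refers to can be sketched as follows: a teacher whose weights are an exponential moving average (EMA) of the student's weights, and a distillation loss that pulls the student's predictions toward the teacher's. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation. The function names, the `alpha` and `temperature` parameters, their default values, and the choice of a KL-divergence loss are all hypothetical; the abstract does not specify them.

```python
import math


def ema_update(teacher, student, alpha=0.01):
    """Move each teacher weight a small step toward the matching student weight.

    `alpha` is a hypothetical momentum coefficient: 0 freezes the teacher,
    1 copies the student outright.
    """
    return [(1.0 - alpha) * t + alpha * s for t, s in zip(teacher, student)]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    Assumed loss form for illustration; it is zero when both sets of
    logits induce the same distribution and positive otherwise.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Usage: one hypothetical training step on a toy two-weight "model".
student_weights = [1.0, 1.0]
teacher_weights = [0.0, 1.0]
teacher_weights = ema_update(teacher_weights, student_weights, alpha=0.5)
loss = distillation_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.8])
```

In a replay-based setup, a term like `distillation_loss` would typically be added to the method's existing loss on each incoming batch, with `ema_update` called after every optimizer step; how the terms are weighted is a design choice this sketch leaves open.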

Code & Models

Repositories

nicolas1203/mkd_ocl (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications

Methods: Knowledge Distillation