Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning
Kaiyou Song, Jin Xie, Shan Zhang, Zimeng Luo

TL;DR
This paper introduces MOKD, a novel multi-mode online knowledge distillation framework that enables two models to collaboratively improve self-supervised visual representations through self- and cross-distillation modes.
Contribution
The paper proposes MOKD, an online distillation method with self-distillation and cross-distillation modes, including a cross-attention feature search strategy, to improve SSL performance without relying on a static pre-trained teacher.
Findings
MOKD outperforms baseline models on various datasets.
Heterogeneous models trained together with MOKD both learn higher-quality representations.
MOKD surpasses existing SSL-KD methods for both student and teacher models.
Abstract
Self-supervised learning (SSL) has made remarkable progress in visual representation learning. Some studies combine SSL with knowledge distillation (SSL-KD) to boost the representation learning performance of small models. In this study, we propose a Multi-mode Online Knowledge Distillation method (MOKD) to boost self-supervised visual representation learning. Different from existing SSL-KD methods that transfer knowledge from a static pre-trained teacher to a student, in MOKD, two different models learn collaboratively in a self-supervised manner. Specifically, MOKD consists of two distillation modes: self-distillation and cross-distillation modes. Among them, self-distillation performs self-supervised learning for each model independently, while cross-distillation realizes knowledge interaction between different models. In cross-distillation, a cross-attention feature search strategy…
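The two distillation modes described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the softmax cross-entropy objective, the temperatures, the `cross_weight` parameter, and every function name below are assumptions made for illustration. The cross-attention feature search strategy is omitted because the abstract is truncated before it is described. Each model is assumed to produce "student" logits for one view and "target" logits for another view (for example from a momentum copy of itself).

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy H(target, pred) between two discrete distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, pred_probs))

def distill_loss(target_logits, student_logits, t_target=0.04, t_student=0.1):
    """Pull the student distribution toward a sharper target distribution.
    The temperatures are hypothetical defaults, not values from the paper."""
    target = softmax(target_logits, t_target)
    pred = softmax(student_logits, t_student)
    return cross_entropy(target, pred)

def two_mode_loss(a_student, a_target, b_student, b_target, cross_weight=1.0):
    """Combine the two modes for a pair of models A and B.

    Self-distillation: each model matches its own target.
    Cross-distillation: each model matches the other model's target.
    `cross_weight` is an assumed balancing factor.
    """
    self_term = distill_loss(a_target, a_student) + distill_loss(b_target, b_student)
    cross_term = distill_loss(b_target, a_student) + distill_loss(a_target, b_student)
    return self_term + cross_weight * cross_term

if __name__ == "__main__":
    # Toy logits standing in for the outputs of two different backbones.
    a_s, a_t = [1.0, 0.5, -0.2], [1.2, 0.3, -0.1]
    b_s, b_t = [0.2, 0.9, 0.0], [0.1, 1.0, -0.3]
    print("self only:", two_mode_loss(a_s, a_t, b_s, b_t, cross_weight=0.0))
    print("self + cross:", two_mode_loss(a_s, a_t, b_s, b_t))
```

Setting `cross_weight=0.0` reduces the sketch to two independent self-supervised models, which makes the contribution of the cross-distillation term easy to isolate in a toy setting.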
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Digital Imaging for Blood Diseases
Methods: Knowledge Distillation
