Channel Self-Supervision for Online Knowledge Distillation

Shixiao Fan; Xuan Cheng; Xiaomin Wang; Chun Yang; Pan Deng; Minghui Liu; Jiali Deng; Ming Liu

arXiv:2203.11660·cs.CV·March 24, 2022

Open Access

TL;DR

This paper introduces CSS, a novel online knowledge distillation method that enhances diversity among peer models through self-supervision and a dual-network structure, leading to improved performance and generalization.

Contribution

CSS combines self-supervised learning with a dual-network structure to address peer homogeneity in online distillation, increasing peer diversity and the effectiveness of distillation.

Findings

1. Outperforms OKDDip in diversity and accuracy on CIFAR-100.

2. Achieves state-of-the-art results on fine-grained datasets.

3. Demonstrates strong generalization across multiple datasets.

Abstract

Recently, researchers have shown increased interest in online knowledge distillation. Adopting a one-stage, end-to-end training fashion, online knowledge distillation uses the aggregated intermediate predictions of multiple peer models for training. However, the absence of a powerful teacher model may result in a homogeneity problem among group peers, adversely affecting the effectiveness of group distillation. In this paper, we propose a novel online knowledge distillation method, Channel Self-Supervision for Online Knowledge Distillation (CSS), which structures diversity in terms of input, target, and network to alleviate the homogenization problem. Specifically, we construct a dual-network multi-branch structure and enhance inter-branch diversity through self-supervised learning, adopting the feature-level transformation and augmenting the…
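
The generic online-distillation setup that the abstract describes (peers trained jointly against their own aggregated predictions) can be sketched as follows. This is a minimal illustration, not the CSS method itself: the plain averaging of softened predictions, the function names, and the `temperature` and `alpha` parameters are all assumptions introduced here.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def online_distillation_losses(peer_logits, label, temperature=3.0, alpha=0.5):
    """Per-peer loss for one example: cross-entropy plus KL to the peer ensemble.

    peer_logits: one list of logits per peer model.
    The ensemble target is a plain average of the softened peer predictions
    (an assumption for illustration; other aggregation schemes exist).
    """
    soft = [softmax(logits, temperature) for logits in peer_logits]
    n_classes = len(soft[0])
    ensemble = [sum(p[c] for p in soft) / len(soft) for c in range(n_classes)]

    losses = []
    for logits, p_soft in zip(peer_logits, soft):
        # Supervised term on the ground-truth label.
        ce = -math.log(softmax(logits)[label] + 1e-12)
        # Distillation term: pull this peer toward the group prediction.
        kd = (temperature ** 2) * kl_divergence(ensemble, p_soft)
        losses.append((1 - alpha) * ce + alpha * kd)
    return losses
```

In this sketch, when all peers produce identical logits the distillation term is zero and each peer learns only from the label, which is one way to see why homogeneous peers weaken group distillation.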

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and ELM

Methods: Knowledge Distillation