Channel Self-Supervision for Online Knowledge Distillation
Shixiao Fan, Xuan Cheng, Xiaomin Wang, Chun Yang, Pan Deng, Minghui Liu, Jiali Deng, Ming Liu

TL;DR
This paper introduces CSS, a novel online knowledge distillation method that enhances diversity among peer models through self-supervision and a dual-network structure, leading to improved performance and generalization.
Contribution
CSS combines self-supervised learning with a dual-network structure to address homogeneity among peers in online distillation, improving their diversity and the effectiveness of group distillation.
Findings
Outperforms OKDDip in diversity and accuracy on CIFAR-100.
Achieves state-of-the-art results on fine-grained datasets.
Demonstrates strong generalization across multiple datasets.
Abstract
Recently, researchers have shown increased interest in online knowledge distillation. Adopting a one-stage, end-to-end training fashion, online knowledge distillation uses the aggregated intermediate predictions of multiple peer models for training. However, the absence of a powerful teacher model may result in a homogeneity problem among group peers, which adversely affects the effectiveness of group distillation. In this paper, we propose a novel online knowledge distillation method, Channel Self-Supervision for Online Knowledge Distillation (CSS), which structures diversity in terms of input, target, and network to alleviate the homogenization problem. Specifically, we construct a dual-network multi-branch structure and enhance inter-branch diversity through self-supervised learning, adopting the feature-level transformation and augmenting the…
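To make the group-distillation idea in the abstract concrete, here is a minimal sketch of a generic peer-to-peer distillation loss in plain Python. It is not the CSS method itself: the function names, the leave-one-out averaging of peer predictions, and the temperature value are all assumptions chosen for illustration. Each peer is pulled toward the average of the other peers' softened predictions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def peer_distillation_losses(peer_logits, temperature=3.0):
    """Hypothetical helper: one distillation loss per peer.

    Assumption: the soft target for peer i is the plain average of the
    other peers' softened predictions (leave-one-out aggregation).
    """
    probs = [softmax(logits, temperature) for logits in peer_logits]
    n = len(probs)
    losses = []
    for i in range(n):
        others = [probs[j] for j in range(n) if j != i]
        target = [sum(col) / len(others) for col in zip(*others)]
        # The T^2 factor is the usual gradient-scale correction in distillation.
        losses.append(temperature ** 2 * kl_divergence(target, probs[i]))
    return losses

# Three peers, three classes: diverse peers give a non-zero signal.
diverse = peer_distillation_losses([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3], [1.0, 1.0, 1.0]])
# Identical peers: every loss collapses to zero.
identical = peer_distillation_losses([[2.0, 0.5, -1.0]] * 3)
```

In this sketch, identical peers produce a zero distillation loss, so they have nothing left to teach each other. That is one way to read the homogeneity problem the abstract describes, and why a method would want to keep the branches diverse.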
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and ELM
Methods: Knowledge Distillation
