AICSD: Adaptive Inter-Class Similarity Distillation for Semantic Segmentation
Amir M. Mansourian, Rozhan Ahmadi, Shohreh Kasaei

TL;DR
This paper introduces AICSD, a novel knowledge distillation method for semantic segmentation that captures inter-class relations and adaptively reduces the teacher's influence during training, leading to improved accuracy.
Contribution
It proposes Inter-Class Similarity Distillation with an Adaptive Loss Weighting strategy to enhance lightweight network training for dense prediction tasks.
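The adaptive weighting idea can be illustrated with a small sketch. This summary does not give the schedule, so the linear decay and the convex combination below are assumptions, and the names `distillation_weight` and `combined_loss` are hypothetical. The only property taken from the text is that the teacher's influence shrinks as training proceeds.

```python
def distillation_weight(epoch, total_epochs):
    """Weight on the distillation term, decaying linearly from 1 to 0.

    Assumption: a linear schedule over epochs; the paper's exact rule
    is not stated in this summary.
    """
    if total_epochs <= 1:
        return 0.0
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return 1.0 - progress


def combined_loss(task_loss, distill_loss, epoch, total_epochs):
    """Blend the supervised loss and the distillation loss.

    Assumption: a convex combination, so the supervised term gains
    weight exactly as the teacher term loses it.
    """
    w = distillation_weight(epoch, total_epochs)
    return (1.0 - w) * task_loss + w * distill_loss


# Early in training the teacher dominates; by the last epoch only the
# supervised loss remains.
first = combined_loss(task_loss=2.0, distill_loss=4.0, epoch=0, total_epochs=10)
last = combined_loss(task_loss=2.0, distill_loss=4.0, epoch=9, total_epochs=10)
```

Under these assumptions the student leans on the teacher while its own predictions are poor, then relies increasingly on the ground-truth labels.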
Findings
Outperforms existing distillation methods on Cityscapes and Pascal VOC 2012 datasets.
Improves mIoU and pixel accuracy significantly.
Effective in transferring high-order class relations.
Abstract
In recent years, deep neural networks have achieved remarkable accuracy in computer vision tasks. Because inference time is a crucial factor, particularly in dense prediction tasks such as semantic segmentation, knowledge distillation has emerged as a successful technique for improving the accuracy of lightweight student networks. Existing methods often neglect the information in channels and among different classes. To overcome these limitations, this paper proposes a novel knowledge distillation method called Inter-Class Similarity Distillation (ICSD). The proposed method transfers high-order relations from the teacher network to the student network by independently computing intra-class distributions for each class from network outputs. This is followed by calculating inter-class similarity matrices for distillation using KL divergence between distributions of…
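The two steps described in the abstract can be sketched in plain Python. The abstract is truncated here, so several details are assumptions: each class's distribution is taken as a softmax over the pixels of that class's logit map, the KL divergence is computed between every pair of class distributions, and the student and teacher matrices are compared with a mean squared difference. All function names are hypothetical.

```python
import math


def spatial_softmax(channel):
    """Turn one class's logit map (flat list over pixels) into a distribution."""
    m = max(channel)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in channel]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def inter_class_similarity(logits):
    """Build a C x C matrix whose (i, j) entry is KL(P_i || P_j).

    logits: list of C class maps, each a flat list of H*W values.
    """
    dists = [spatial_softmax(ch) for ch in logits]
    return [[kl_divergence(pi, pj) for pj in dists] for pi in dists]


def icsd_loss(student_logits, teacher_logits):
    """Mean squared difference between the two similarity matrices.

    Assumption: the matching criterion is not given in this summary.
    """
    s = inter_class_similarity(student_logits)
    t = inter_class_similarity(teacher_logits)
    c = len(s)
    return sum((s[i][j] - t[i][j]) ** 2 for i in range(c) for j in range(c)) / (c * c)


# Toy example: 3 classes, 4 pixels each.
teacher = [[2.0, 0.5, 0.1, 0.1], [0.1, 0.2, 1.5, 2.0], [0.3, 0.3, 0.3, 0.3]]
student = [[1.0, 0.8, 0.2, 0.1], [0.2, 0.1, 1.0, 1.2], [0.5, 0.1, 0.4, 0.2]]
loss = icsd_loss(student, teacher)
```

Because each matrix entry relates two classes rather than two pixels, the loss asks the student to reproduce how the teacher's class maps differ from one another, not just the per-pixel outputs.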
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Knowledge Distillation · Adaptive Robust Loss
