Normalized Feature Distillation for Semantic Segmentation
Tao Liu, Xi Yang, Chenshu Chen

TL;DR
This paper introduces normalized feature distillation (NFD) for semantic segmentation. NFD normalizes features so that the student does not spend its capacity imitating the magnitude of the teacher's feature response, and it reports state-of-the-art distillation results without manually designed forms of knowledge.
Contribution
Proposes a simple normalized feature distillation approach that improves semantic segmentation performance without elaborate manual knowledge design.
Findings
Achieves state-of-the-art results on Cityscapes, VOC 2012, and ADE20K datasets.
Distills from the original features after normalization, removing the need to manually design new forms of knowledge.
Effective for model compression in semantic segmentation.
Abstract
As a promising approach in model compression, knowledge distillation improves the performance of a compact model by transferring the knowledge from a cumbersome one. The kind of knowledge used to guide the training of the student is important. Previous distillation methods in semantic segmentation strive to extract various forms of knowledge from the features, which involve elaborate manual design relying on prior information and have limited performance gains. In this paper, we propose a simple yet effective feature distillation method called normalized feature distillation (NFD), aiming to enable effective distillation with the original features without the need to manually design new forms of knowledge. The key idea is to prevent the student from focusing on imitating the magnitude of the teacher's feature response by normalization. Our method achieves state-of-the-art distillation…
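The key idea in the abstract, keeping the student from imitating the magnitude of the teacher's features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `normalize_features` and `nfd_loss`, the choice to standardize each channel over its spatial positions, the mean-squared-error loss, and the assumption that student and teacher features share the same shape are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np


def normalize_features(feat, eps=1e-6):
    """Standardize each channel of each sample over its spatial positions.

    feat has shape (N, C, H, W). The normalization axes are an assumption
    made for this sketch, not a detail taken from the paper.
    """
    mean = feat.mean(axis=(2, 3), keepdims=True)
    std = feat.std(axis=(2, 3), keepdims=True)
    return (feat - mean) / (std + eps)


def nfd_loss(student_feat, teacher_feat):
    """Mean squared error between normalized student and teacher features.

    Assumes both tensors already have the same shape (hypothetical setup).
    """
    s = normalize_features(student_feat)
    t = normalize_features(teacher_feat)
    return float(np.mean((s - t) ** 2))


rng = np.random.default_rng(0)
teacher = rng.normal(size=(2, 8, 16, 16))

# A student with the same spatial pattern but a different scale and offset.
student = 0.1 * teacher + 3.0

plain_mse = float(np.mean((student - teacher) ** 2))
normalized = nfd_loss(student, teacher)

print("plain MSE on raw features:", plain_mse)
print("MSE on normalized features:", normalized)
```

In this toy setup the raw-feature loss is large even though the student reproduces the teacher's spatial pattern exactly, while the normalized loss is close to zero. That is the behaviour the abstract describes: the penalty for mismatched magnitude disappears and only the pattern of the response is compared.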
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Machine Learning and Data Classification
Methods: Knowledge Distillation
