Contrastive Representation Distillation via Multi-Scale Feature Decoupling

Cuipeng Wang; Haipeng Wang

arXiv:2502.05835 · cs.CV · October 14, 2025

TL;DR

This paper introduces MSDCRD, a novel distillation framework that decouples features into multi-scale local components and uses contrastive learning for efficient, high-performance knowledge transfer without external memory.

Contribution

It proposes a model-agnostic method that decouples features into multi-scale local parts and employs contrastive losses, improving distillation efficiency and effectiveness.

Findings

01. Achieves superior performance in both homogeneous and heterogeneous teacher-student settings.

02. Eliminates the need for external memory buffers.

03. Demonstrates strong generalization across architectures.

Abstract

Knowledge distillation enhances the performance of compact student networks by transferring knowledge from more powerful teacher networks without introducing additional parameters. In the feature space, local regions within an individual global feature encode distinct yet interdependent semantic information. Previous feature-based distillation methods mainly emphasize global feature alignment while neglecting the decoupling of local regions within an individual global feature, which often results in semantic confusion and suboptimal performance. Moreover, conventional contrastive representation distillation suffers from low efficiency due to its reliance on a large memory buffer to store feature samples. To address these limitations, this work proposes MSDCRD, a model-agnostic distillation framework that systematically decouples global features into multi-scale local features and…
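
The abstract is cut off before it describes the method in detail, so the following is only a minimal sketch of the two ideas it names: splitting one global feature map into local features at several scales, and contrasting student against teacher features using only the current batch instead of a memory buffer. The function names, the grid-based average pooling, the choice of scales, and the InfoNCE-style loss are all assumptions made for illustration, not the construction used in MSDCRD.

```python
import math


def multi_scale_pool(feature_map, scales=(1, 2, 4)):
    """Average-pool a C x H x W feature map (nested lists) into an s x s grid
    for each scale s. Hypothetical helper: assumes H and W are divisible by
    every scale. Returns C-dimensional local vectors, coarse to fine."""
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    local_features = []
    for s in scales:
        bh, bw = height // s, width // s  # size of one grid cell
        for gy in range(s):
            for gx in range(s):
                vec = []
                for c in range(channels):
                    total = 0.0
                    for y in range(gy * bh, (gy + 1) * bh):
                        for x in range(gx * bw, (gx + 1) * bw):
                            total += feature_map[c][y][x]
                    vec.append(total / (bh * bw))
                local_features.append(vec)
    return local_features


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def in_batch_contrastive_loss(student_feats, teacher_feats, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in for the paper's loss).
    Student vector i is pulled toward teacher vector i and pushed away from
    the other teacher vectors in the same batch, so no memory bank is needed."""
    s = [l2_normalize(v) for v in student_feats]
    t = [l2_normalize(v) for v in teacher_feats]
    total = 0.0
    for i, s_i in enumerate(s):
        logits = [sum(a * b for a, b in zip(s_i, t_j)) / temperature for t_j in t]
        max_logit = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        total += log_denominator - logits[i]
    return total / len(s)
```

With scales `(1, 2, 4)` each feature map yields 1 + 4 + 16 = 21 local vectors. One way to use the sketch is to apply `in_batch_contrastive_loss` separately for each local region, with the batch dimension supplying the negatives; how the paper actually combines scales and losses is not stated in the truncated abstract.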

Taxonomy

Topics: Fault Detection and Control Systems · Face and Expression Recognition · Neural Networks and Applications

Methods: Focus