Cross-Architecture Distillation Made Simple with Redundancy Suppression

Weijia Zhang; Yuehao Liu; Wu Ran; Chao Ma

arXiv:2507.21844 · cs.CV · July 30, 2025


TL;DR

This paper introduces a simple, efficient method for cross-architecture knowledge distillation that suppresses redundant, architecture-exclusive information. It outperforms more complex existing methods such as OFA while using fewer parameters and remaining broadly applicable.

Contribution

The authors propose a novel redundancy suppression distillation (RSD) loss that simplifies cross-architecture knowledge transfer without architecture-specific modules.

Findings

01 Outperforms OFA on CIFAR-100 and ImageNet-1k benchmarks

02 Uses fewer parameters than existing methods

03 Provides a simple, effective baseline for cross-architecture distillation

Abstract

We describe a simple method for cross-architecture knowledge distillation, where the knowledge transfer is cast into a redundant information suppression formulation. Existing methods introduce sophisticated modules, architecture-tailored designs, and excessive parameters, which impair their efficiency and applicability. We propose to extract the architecture-agnostic knowledge in heterogeneous representations by reducing the redundant architecture-exclusive information. To this end, we present a simple redundancy suppression distillation (RSD) loss, which comprises cross-architecture invariance maximisation and feature decorrelation objectives. To prevent the student from entirely losing its architecture-specific capabilities, we further design a lightweight module that decouples the RSD objective from the student's internal representations. Our method is devoid of the…
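
The abstract names two ingredients of the RSD loss, invariance maximisation and feature decorrelation, but the truncated text does not give the formula. The sketch below is therefore only an illustration of how such a pair of objectives is commonly combined: it assumes a cross-correlation formulation in the style of Barlow Twins, and it assumes the student and teacher features have already been projected to a shared dimension (for example by a lightweight module like the one the abstract mentions). The function name `rsd_style_loss` and the parameters `lam` and `eps` are hypothetical and do not come from the paper.

```python
import numpy as np


def rsd_style_loss(student_feats, teacher_feats, lam=0.005, eps=1e-6):
    """Illustrative invariance + decorrelation loss (NOT the paper's exact RSD loss).

    Assumptions: both inputs have shape (batch, dim) and share the same dim;
    `lam` (weight of the decorrelation term) is a hypothetical hyperparameter.
    """
    s = np.asarray(student_feats, dtype=np.float64)
    t = np.asarray(teacher_feats, dtype=np.float64)
    if s.ndim != 2 or s.shape != t.shape:
        raise ValueError("expected two arrays of identical shape (batch, dim)")

    n = s.shape[0]
    # Standardise every feature dimension over the batch.
    s = (s - s.mean(axis=0)) / (s.std(axis=0) + eps)
    t = (t - t.mean(axis=0)) / (t.std(axis=0) + eps)

    # Cross-correlation matrix between student and teacher dimensions: (dim, dim).
    c = s.T @ t / n
    diag = np.diag(c)

    # Invariance term: matching dimensions should be perfectly correlated.
    invariance = np.sum((1.0 - diag) ** 2)
    # Decorrelation term: mismatched dimensions should be uncorrelated.
    decorrelation = np.sum(c ** 2) - np.sum(diag ** 2)

    return invariance + lam * decorrelation
```

Under these assumptions, the loss is near zero when the student features match the teacher features up to a per-dimension shift and scale, and it grows as the two sets of features become unrelated. In training, this term would typically be added to the usual task loss; how the paper weights and applies it is not stated in the text above.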
