When Data-Free Knowledge Distillation Meets Non-Transferable Teacher: Escaping Out-of-Distribution Trap is All You Need

Ziming Hong; Runnan Chen; Zengmao Wang; Bo Han; Bo Du; Tongliang Liu

arXiv:2507.04119·cs.LG·July 8, 2025

When Data-Free Knowledge Distillation Meets Non-Transferable Teacher: Escaping Out-of-Distribution Trap is All You Need

Ziming Hong, Runnan Chen, Zengmao Wang, Bo Han, Bo Du, Tongliang Liu

PDF

TL;DR

This paper addresses the challenge of data-free knowledge distillation with untrusted teachers, proposing a method to filter out out-of-distribution samples and improve the robustness of the student model.

Contribution

It introduces Adversarial Trap Escaping (ATEsc), a novel approach to identify and filter OOD-like synthetic samples during data-free knowledge distillation from non-transferable teachers.

Findings

01

ATEsc effectively filters OOD-like samples, enhancing knowledge transfer.

02

The method improves robustness of DFKD against untrusted teachers.

03

Experimental results show significant performance gains.

Abstract

Data-free knowledge distillation (DFKD) transfers knowledge from a teacher to a student without access the real in-distribution (ID) data. Its common solution is to use a generator to synthesize fake data and use them as a substitute for real ID data. However, existing works typically assume teachers are trustworthy, leaving the robustness and security of DFKD from untrusted teachers largely unexplored. In this work, we conduct the first investigation into distilling non-transferable learning (NTL) teachers using DFKD, where the transferability from an ID domain to an out-of-distribution (OOD) domain is prohibited. We find that NTL teachers fool DFKD through divert the generator's attention from the useful ID knowledge to the misleading OOD knowledge. This hinders ID knowledge transfer but prioritizes OOD knowledge transfer. To mitigate this issue, we propose Adversarial Trap Escaping…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.