Teacher-Feature Drifting: One-Step Diffusion Distillation with Pretrained Diffusion Representations
Yuan Zhang, Chenyi Li, Guoqing Ma, Jiajun Zha, Yuanming Yang, Bo Wang, Wei Tang, Wenbo Li, Haoyang Huang, and Nan Duan

TL;DR
This paper introduces a simplified one-step diffusion distillation method that uses a pretrained diffusion model's intermediate hidden states as features, producing high-quality, diverse images in a single generation step.
Contribution
The method uses the pretrained teacher's intermediate hidden states as the feature representation for the drifting loss, so no additional feature extractor has to be trained or introduced, which simplifies the distillation pipeline.
Findings
Achieves an FID of 1.58 on ImageNet 64×64.
Attains an FID of 18.4 on SDXL.
Demonstrates efficient one-step generation with competitive quality and diversity.
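For readers unfamiliar with the metric, FID (Fréchet Inception Distance) compares Gaussian statistics of Inception features computed on real and generated images; lower is better. The formula below is the standard definition, not something specific to this paper, and the symbols are introduced here only for illustration:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Here \mu_r, \Sigma_r are the feature mean and covariance for real images, and \mu_g, \Sigma_g are the same statistics for generated images.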
Abstract
Sampling from pretrained diffusion and flow-matching models typically requires many forward passes to generate diverse and high-fidelity images. Existing distillation methods often rely on multiple auxiliary networks, carefully designed training stages, or complex optimization pipelines. In this work, we revisit the recently proposed Drifting Model objective and show that a single drifting loss can be used directly to simplify one-step distillation. A key observation is that the pretrained diffusion teacher itself already provides a strong representation space. Unlike the original Drifting Model, which relies on an additional pretrained feature extractor, we use intermediate hidden states of the pretrained teacher model as the feature representation. This removes the need for training or introducing an extra representation network while preserving a semantically meaningful feature…
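The idea of reading features from the teacher's intermediate hidden states can be sketched with PyTorch forward hooks. This is a minimal illustration under stated assumptions, not the authors' implementation: `ToyTeacher`, `extract_hidden_features`, the layer names, and the tensor shapes are all hypothetical, and a real diffusion teacher would also take a timestep (and possibly a condition) as input. The drifting loss itself is not shown.

```python
import torch
import torch.nn as nn


# Hypothetical stand-in for a pretrained diffusion teacher. A real teacher
# would be a U-Net or transformer that also receives a timestep.
class ToyTeacher(nn.Module):
    def __init__(self, dim=16, hidden=32):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(dim, hidden), nn.SiLU())
        self.block2 = nn.Sequential(nn.Linear(hidden, hidden), nn.SiLU())
        self.head = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.head(self.block2(self.block1(x)))


def extract_hidden_features(teacher, x, layer_names):
    """Run the teacher once and return the outputs of the named submodules."""
    features = {}
    modules = dict(teacher.named_modules())
    handles = []
    for name in layer_names:
        # Bind `name` as a default argument so each hook stores under its own key.
        def hook(module, inputs, output, name=name):
            features[name] = output

        handles.append(modules[name].register_forward_hook(hook))
    try:
        teacher(x)
    finally:
        # Always detach the hooks so the teacher is left unmodified.
        for handle in handles:
            handle.remove()
    return features


torch.manual_seed(0)
teacher = ToyTeacher().eval()
# Freeze the teacher weights. Gradients can still flow through its
# activations back to the inputs, which a generator would need.
for p in teacher.parameters():
    p.requires_grad_(False)

# Stands in for a batch of one-step generator outputs.
fake_images = torch.randn(4, 16, requires_grad=True)
feats = extract_hidden_features(teacher, fake_images, ["block1", "block2"])
```

Note that the sketch freezes the teacher's parameters rather than wrapping the forward pass in `torch.no_grad()`, so a feature-space loss computed on `feats` could still backpropagate to whatever produced `fake_images`.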
