DreamTeacher: Pretraining Image Backbones with Deep Generative Models
Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler

TL;DR
DreamTeacher is a self-supervised framework that distills knowledge from trained generative models into image backbones, so the backbones learn strong feature representations without manual annotations.
Contribution
The paper presents a knowledge distillation method that uses generative models to pre-train image backbones and outperforms existing self-supervised approaches.
Findings
DreamTeacher outperforms existing self-supervised methods.
Unsupervised ImageNet pre-training improves downstream performance.
Diffusion generative models are effective for representation learning.
Abstract
In this work, we introduce a self-supervised feature representation learning framework DreamTeacher that utilizes generative networks for pre-training downstream image backbones. We propose to distill knowledge from a trained generative model into standard image backbones that have been well engineered for specific perception tasks. We investigate two types of knowledge distillation: 1) distilling learned generative features onto target image backbones as an alternative to pretraining these backbones on large labeled datasets such as ImageNet, and 2) distilling labels obtained from generative networks with task heads onto logits of target backbones. We perform extensive analyses on multiple generative models, dense prediction benchmarks, and several pre-training regimes. We empirically find that our DreamTeacher significantly outperforms existing self-supervised representation learning…
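The two distillation types named in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss functions, so everything here is an assumption: the feature loss is a plain mean squared error between L2-normalized feature vectors, and the label loss is a temperature-softened KL divergence in the style of classic knowledge distillation. The function names `feature_distillation_loss` and `label_distillation_loss` are hypothetical and are not taken from the paper.

```python
import math

# Illustrative sketch only: these losses are generic stand-ins, not the
# paper's actual objectives (the abstract does not specify them).

def l2_normalize(vec, eps=1e-8):
    """Scale a feature vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def feature_distillation_loss(teacher_feats, student_feats):
    """Type 1 (assumed form): MSE between normalized teacher (generative)
    features and student (backbone) features, one vector per location."""
    total, count = 0.0, 0
    for t, s in zip(teacher_feats, student_feats):
        t_n, s_n = l2_normalize(t), l2_normalize(s)
        total += sum((a - b) ** 2 for a, b in zip(t_n, s_n))
        count += len(t_n)
    return total / count

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def label_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Type 2 (assumed form): KL(teacher || student) between softened
    class distributions from the teacher's task head and the student."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

if __name__ == "__main__":
    teacher = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.0]]
    student = [[0.9, 0.1, 1.8], [0.4, 0.6, 0.1]]
    print(feature_distillation_loss(teacher, student))
    print(label_distillation_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.5]))
```

Both losses are zero when the student matches the teacher and grow as the two diverge. In a real setup the features would be dense tensors from multiple network layers rather than Python lists.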
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing · Domain Adaptation and Few-Shot Learning
Methods: Diffusion
