Guided Diffusion from Self-Supervised Diffusion Features
Vincent Tao Hu, Yunlu Chen, Mathilde Caron, Yuki M. Asano, Cees G. M. Snoek, Bjorn Ommer

TL;DR
This paper introduces a novel framework for extracting guidance directly from diffusion models, improving downstream task performance without relying on external classifiers or annotations.
Contribution
It proposes a method to derive guidance from diffusion models themselves, enhancing feature discriminability and sampling efficiency, with superior results on large-scale datasets.
Findings
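Guidance signals extracted from diffusion models are on par with those from class-conditioned diffusion models.
Feature regularization based on the Sinkhorn-Knopp algorithm improves feature discriminability relative to unconditional diffusion models.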
Online training enables concurrent guidance extraction.
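As a rough illustration of the Sinkhorn-Knopp step named in the findings, the sketch below balances a matrix of sample-to-cluster scores so that every cluster receives roughly equal mass. It is a generic NumPy version of the algorithm, not the paper's implementation: the function name `sinkhorn_knopp`, the temperature `epsilon`, and the iteration count `n_iters` are all assumptions chosen for the example.

```python
import numpy as np

def sinkhorn_knopp(scores, n_iters=3, epsilon=0.05):
    """Turn an (N, K) score matrix into soft, cluster-balanced assignments.

    Hypothetical helper: epsilon and n_iters are illustrative defaults,
    not values taken from the paper.
    """
    # Subtract the global max before exponentiating for numerical stability.
    q = np.exp((scores - scores.max()) / epsilon)
    q /= q.sum()
    n, k = q.shape
    for _ in range(n_iters):
        # Column step: rescale so each of the K clusters has total mass 1/K.
        q /= q.sum(axis=0, keepdims=True) * k
        # Row step: rescale so each of the N samples has total mass 1/N.
        q /= q.sum(axis=1, keepdims=True) * n
    # Multiply by N so every row is a probability distribution over clusters.
    return q * n

# Usage: 8 samples scored against 4 hypothetical cluster prototypes.
rng = np.random.default_rng(0)
scores = rng.normal(size=(8, 4))
assignments = sinkhorn_knopp(scores)
print(assignments.sum(axis=1))  # row sums of the soft assignments
```

Because the loop ends on the row step, each row of the result sums to one, while the column sums are only approximately balanced after a small number of iterations.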
Abstract
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or classifier pretraining. That is why guidance was harnessed from self-supervised learning backbones, like DINO. However, recent studies have revealed that the feature representation derived from the diffusion model itself is discriminative for numerous downstream tasks as well, which prompts us to propose a framework to extract guidance from, and specifically for, diffusion models. Our research has yielded several significant contributions. Firstly, the guidance signals from diffusion models are on par with those from class-conditioned diffusion models. Secondly, feature regularization, when based on the Sinkhorn-Knopp algorithm, can further enhance feature discriminability in comparison to unconditional diffusion models. Thirdly, we have constructed an online…
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Domain Adaptation and Few-Shot Learning
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Layer Normalization · Softmax · Residual Connection · Vision Transformer · self-DIstillation with NO labels · Diffusion
