Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement

Tao Yang; Cuiling Lan; Yan Lu; Nanning Zheng

arXiv:2402.09712·cs.CV·June 13, 2024·2 cites

TL;DR

This paper demonstrates that diffusion models with cross-attention can effectively learn disentangled representations without additional regularization, surpassing previous methods on benchmark datasets.

Contribution

It introduces a novel framework using diffusion models with cross-attention as an inductive bias for unsupervised disentangled representation learning, requiring no complex regularization.

Findings

01. Achieves superior disentanglement performance on benchmarks.

02. No additional regularization needed for effective disentanglement.

03. Provides insights through ablation studies and visualization.

Abstract

Disentangled representation learning strives to extract the intrinsic factors within observed data. Factorizing these representations in an unsupervised manner is notably challenging and usually requires tailored loss functions or specific structural designs. In this paper, we introduce a new perspective and framework, demonstrating that diffusion models with cross-attention can serve as a powerful inductive bias to facilitate the learning of disentangled representations. We propose to encode an image to a set of concept tokens and treat them as the condition of the latent diffusion for image reconstruction, where cross-attention over the concept tokens is used to bridge the interaction between the encoder and diffusion. Without any additional regularization, this framework achieves superior disentanglement performance on the benchmark datasets, surpassing all previous methods with…
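The conditioning mechanism described in the abstract, where image features inside the diffusion network attend over a small set of concept tokens, can be illustrated with a minimal single-head cross-attention sketch. This is not the authors' implementation: the function name `cross_attention`, the projection matrices, and all dimensions are assumptions chosen for illustration, and random arrays stand in for the encoder output and the denoiser's spatial features.

```python
import numpy as np

def cross_attention(features, tokens, w_q, w_k, w_v):
    """Single-head cross-attention (illustrative sketch, hypothetical names).

    features: (P, d_x) spatial features from the denoiser, used as queries
    tokens:   (N, d_c) concept tokens from the image encoder, used as keys/values
    Returns the attended output (P, d) and the attention map (P, N).
    """
    q = features @ w_q                      # (P, d)
    k = tokens @ w_k                        # (N, d)
    v = tokens @ w_v                        # (N, d)
    scores = q @ k.T / np.sqrt(q.shape[-1]) # scaled dot-product, (P, N)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)        # softmax over tokens
    return attn @ v, attn

# Toy sizes (assumed): 16 spatial positions, 4 concept tokens.
rng = np.random.default_rng(0)
P, N, d_x, d_c, d = 16, 4, 8, 6, 8
features = rng.normal(size=(P, d_x))        # stand-in for denoiser features
concept_tokens = rng.normal(size=(N, d_c))  # stand-in for encoder output
w_q = rng.normal(size=(d_x, d))
w_k = rng.normal(size=(d_c, d))
w_v = rng.normal(size=(d_c, d))

out, attn = cross_attention(features, concept_tokens, w_q, w_k, w_v)
```

Each row of `attn` is a distribution over the concept tokens for one spatial position, so the image content reaches the denoiser only through these few tokens.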

Videos

Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement · SlidesLive

Taxonomy

Topics: Advanced Statistical Modeling Techniques · Statistical Mechanics and Entropy · Opinion Dynamics and Social Influence

Methods: Sparse Evolutionary Training · Diffusion