ELROND: Exploring and decomposing intrinsic capabilities of diffusion models
Paweł Skierś, Tomasz Trzciński, Kamil Deja

TL;DR
This paper introduces a framework to analyze and control the semantic variations in diffusion model outputs by disentangling underlying generative directions, improving interpretability and diversity.
Contribution
It proposes a method to identify and manipulate semantic directions in diffusion models' input space, enhancing control and diversity of generated images.
Findings
Isolates interpretable, steerable directions for concept control
Reduces mode collapse and restores diversity in models
Provides a new estimator for concept complexity based on subspace dimensionality
Abstract
A single text prompt passed to a diffusion model often yields a wide range of visual outputs determined solely by the stochastic process, leaving users with no direct control over which specific semantic variations appear in the image. While existing unsupervised methods attempt to analyze these variations via output features, they omit the underlying generative process. In this work, we propose a framework to disentangle these semantic directions directly within the input embedding space. To that end, we collect a set of gradients, obtained by backpropagating the differences between stochastic realizations of a fixed prompt, and then decompose them into meaningful steering directions with either Principal Component Analysis or a Sparse Autoencoder. Our approach yields three key contributions: (1) it isolates interpretable, steerable directions for precise, fine-grained control over a single…
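The decomposition step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the collected gradients are already stacked into a matrix of shape (n_samples, dim), replaces real diffusion-model gradients with synthetic data containing one planted direction, and uses plain PCA via SVD. The names `pca_directions` and `steer`, and the scale `alpha`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_directions(gradients, k):
    """Return the top-k principal directions of a (n_samples, dim) matrix.

    Hypothetical helper: rows stand in for gradients with respect to the
    input embedding, one row per stochastic realization.
    """
    G = np.asarray(gradients, dtype=np.float64)
    centered = G - G.mean(axis=0, keepdims=True)
    # Rows of vt are unit-norm principal directions, ordered by variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    explained = variance / variance.sum()
    return vt[:k], explained[:k]


def steer(embedding, direction, alpha):
    """Shift an embedding along a direction by a scalar step (assumed linear)."""
    return embedding + alpha * direction


# Synthetic stand-in for collected gradients: one dominant planted direction
# plus small isotropic noise (an assumption made only for this demo).
rng = np.random.default_rng(0)
dim = 16
planted = np.zeros(dim)
planted[3] = 1.0
coeffs = rng.normal(scale=5.0, size=(200, 1))
noise = rng.normal(scale=0.1, size=(200, dim))
grads = coeffs * planted + noise

directions, explained = pca_directions(grads, k=2)
steered = steer(np.zeros(dim), directions[0], alpha=2.0)
```

In this toy setting the first principal direction aligns with the planted axis up to sign. With real gradients, how many components to keep and how far to step along each one would need to be chosen empirically.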
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face Recognition and Perception · Visual perception and processing mechanisms
