Deep Dreams Are Made of This: Visualizing Monosemantic Features in Diffusion Models

Adam Szokalski; Mateusz Modrzejewski

arXiv:2605.08218·cs.LG·May 12, 2026

TL;DR

This paper introduces latent visualization by optimization (LVO), a technique that extends feature visualization to latent diffusion models. It uses sparse autoencoders (SAEs) to disentangle layer representations, which yields clear visualizations of monosemantic features.

Contribution

The paper presents LVO, an interpretability method for latent diffusion models that uses SAEs to disentangle polysemantic layer representations and visualizes the resulting monosemantic features by optimizing in latent space.

Findings

1. SAE features produce clear visualizations of recognizable concepts.

2. Regularization techniques transfer from pixel space to the latent domain.

3. LVO provides insights into feature activation mechanisms.

Abstract

This paper proposes latent visualization by optimization (LVO), a mechanistic interpretability technique that extends feature visualization by optimization (originally developed for convolutional neural networks) to latent diffusion models. LVO employs sparse autoencoders (SAEs) to disentangle polysemantic layer representations into monosemantic features. Key contributions include latent-space optimization, time-step activity analysis, schedule-matched noise injection, prior initialization through feature steering, and suitable regularization strategies. We demonstrate the method on Stable Diffusion 1.5 fine-tuned on the Style50 dataset, showing that SAE features produce clear visualizations of recognizable concepts (including diagonal compositions, human figures, roses, cables, and waterfall foam) that correlate with dataset examples, while the baseline without disentanglement…
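The core idea of visualization by optimization through an SAE feature can be illustrated with a toy, pure-Python sketch. This is not the authors' implementation: the `layer` function is a hypothetical stand-in for a diffusion model layer, the SAE is reduced to a single encoder feature (`w`, `b`), and all names and numeric values are assumptions chosen for illustration. The sketch also omits the diffusion-specific parts the abstract lists (time-step activity analysis, schedule-matched noise injection, feature steering) and keeps only gradient ascent on a latent with an L2 penalty as a simple regularizer.

```python
import math

def layer(A, z):
    """Toy stand-in for a model layer (assumption): h = tanh(A z)."""
    return [math.tanh(sum(a * x for a, x in zip(row, z))) for row in A]

def sae_pre_activation(w, b, h):
    """Pre-ReLU value of one SAE encoder feature: w . h + b."""
    return sum(wi * hi for wi, hi in zip(w, h)) + b

def sae_feature(w, b, h):
    """SAE feature activation: ReLU(w . h + b)."""
    return max(0.0, sae_pre_activation(w, b, h))

def visualize_feature(A, w, b, z0, steps=200, lr=0.1, l2=0.01):
    """Gradient ascent on the latent z to maximize one SAE feature.

    The objective is the pre-ReLU value minus an L2 penalty on z.
    Using the pre-activation is a choice made for this sketch: it keeps
    a useful gradient even while the ReLU output is still zero.
    """
    z = list(z0)
    for _ in range(steps):
        h = layer(A, z)
        grad = []
        for j in range(len(z)):
            # Chain rule: d(w . tanh(A z))/dz_j = sum_i w_i (1 - h_i^2) A_ij
            d_pre = sum(w[i] * (1.0 - h[i] ** 2) * A[i][j] for i in range(len(A)))
            grad.append(d_pre - 2.0 * l2 * z[j])  # subtract the L2 penalty gradient
        z = [zj + lr * g for zj, g in zip(z, grad)]
    return z

# Hypothetical toy values, not taken from the paper.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [1.0, -0.5, 0.5]
b = -0.2
z0 = [0.0, 0.0]

initial_act = sae_feature(w, b, layer(A, z0))
z_opt = visualize_feature(A, w, b, z0)
final_act = sae_feature(w, b, layer(A, z_opt))
print(initial_act, final_act)
```

With these toy values the feature starts inactive at the zero latent, and the ascent moves `z` toward a region where it fires. In a real setting the hand-written gradient would presumably be replaced by automatic differentiation through the actual model layer and a trained SAE encoder.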
