DreamReader: An Interpretability Toolkit for Text-to-Image Models

Nirmalendu Prakash; Narmeen Oozeer; Michael Lan; Luka Samkharadze; Phillip Howard; Roy Ka-Wei Lee; Dhruv Nathawani; Shivam Raval; Amirali Abdullah

arXiv:2603.13299·cs.LG·March 17, 2026

Open Access

TL;DR

DreamReader is a model-agnostic toolkit for text-to-image diffusion models that unifies activation extraction, causal patching, structured ablations, and activation steering, supporting systematic interpretability analysis and controlled interventions on internal representations.

Contribution

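DreamReader formalizes diffusion interpretability as composable representation operators behind a single model-agnostic abstraction layer, and adds three novel intervention primitives for diffusion models, including LoReFT-based representation fine-tuning and classifier-guided gradient steering with MLP probes trained on activations.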

Findings

01. Successful activation stitching between models

02. Effective LoReFT-based concept steering in image generation

03. Promising transferability of interpretability techniques from language models
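
For context on finding 02, the LoReFT intervention as it is usually stated in the representation fine-tuning literature for language models edits a hidden state only inside a learned low-rank subspace. The formula below is that general definition, not something taken from this summary. How DreamReader chooses the modules, timesteps, or rank when applying it to diffusion models is not specified here, and the symbols are introduced only for illustration.

```latex
% LoReFT edit of a hidden state h \in \mathbb{R}^{d}
% R \in \mathbb{R}^{r \times d} has orthonormal rows (r \ll d),
% W \in \mathbb{R}^{r \times d} and b \in \mathbb{R}^{r} are learned.
\Phi_{\text{LoReFT}}(h) = h + R^{\top}\bigl(W h + b - R h\bigr)
```

The term R h reads off the current coordinates of h in the subspace, W h + b gives the target coordinates, and R^T maps the difference back, so components of h orthogonal to the rows of R are left unchanged.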

Abstract

Despite the rapid adoption of text-to-image (T2I) diffusion models, causal and representation-level analysis remains fragmented and largely limited to isolated probing techniques. To address this gap, we introduce DreamReader: a unified framework that formalizes diffusion interpretability as composable representation operators spanning activation extraction, causal patching, structured ablations, and activation steering across modules and timesteps. DreamReader provides a model-agnostic abstraction layer enabling systematic analysis and intervention across diffusion architectures. Beyond consolidating existing methods, DreamReader introduces three novel intervention primitives for diffusion models: (1) representation fine-tuning (LoReFT) for subspace-constrained internal adaptation; (2) classifier-guided gradient steering using MLP probes trained on activations; and (3) component-level…
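
The operators named in the abstract (activation extraction, causal patching, ablation, and steering) can be illustrated with a minimal, dependency-free sketch. This is not DreamReader's API: the toy "model", the module names, and the helpers `run`, `patch`, `ablate`, `steer`, and `compose` are all hypothetical, and the sketch ignores diffusion timesteps entirely. It only shows how interventions can be written as small functions on activations and then composed.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
Hook = Callable[[Vector], Vector]

# Toy stand-in for a diffusion model: an ordered list of named modules.
# Names and functions are hypothetical, not DreamReader's API.
LAYERS = [
    ("text_encoder", lambda x: [2.0 * v for v in x]),
    ("unet_mid", lambda x: [v + 1.0 for v in x]),
    ("unet_out", lambda x: [v * v for v in x]),
]


def run(x: Vector,
        hooks: Optional[Dict[str, Hook]] = None,
        cache: Optional[Dict[str, Vector]] = None) -> Vector:
    """Run the toy model, applying an optional hook after each module."""
    hooks = hooks or {}
    for name, fn in LAYERS:
        x = fn(x)
        if name in hooks:
            x = hooks[name](x)       # intervention on this activation
        if cache is not None:
            cache[name] = list(x)    # extraction: record the activation
    return x


def patch(source: Vector) -> Hook:
    """Causal patching: overwrite the activation with one from another run."""
    return lambda _act: list(source)


def ablate() -> Hook:
    """Zero-ablation of the whole activation."""
    return lambda act: [0.0 for _ in act]


def steer(direction: Vector, alpha: float) -> Hook:
    """Activation steering: add a scaled direction to the activation."""
    return lambda act: [a + alpha * d for a, d in zip(act, direction)]


def compose(*hooks: Hook) -> Hook:
    """Apply several operators in order at the same module."""
    def composed(act: Vector) -> Vector:
        for h in hooks:
            act = h(act)
        return act
    return composed


# Extract activations from a "clean" run, then patch them into a "corrupted" run.
clean_cache: Dict[str, Vector] = {}
clean_out = run([1.0, 2.0], cache=clean_cache)
corrupt_out = run([0.0, 0.0])
patched_out = run([0.0, 0.0], hooks={"unet_mid": patch(clean_cache["unet_mid"])})

ablated_out = run([1.0, 2.0], hooks={"unet_mid": ablate()})
steered_out = run([1.0, 2.0], hooks={"unet_mid": steer([1.0, 0.0], alpha=2.0)})
combined_out = run([1.0, 2.0],
                   hooks={"unet_mid": compose(ablate(), steer([1.0, 1.0], 1.0))})
```

In this toy setting, patching the `unet_mid` activation from the clean run into the corrupted run restores the clean output, which is the basic logic of a causal patching experiment. A real toolkit would attach such operators to framework-level hooks on model submodules and index them by timestep as well as by module.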

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning