Mechanistic Interpretability of Diffusion Models: Circuit-Level Analysis and Causal Validation
Dip Roy

TL;DR
This paper presents a circuit-level analysis of diffusion models, showing that they process synthetic images and real face images (CelebA) differently, and identifies specialized attention mechanisms whose roles are confirmed through causal intervention experiments.
Contribution
It introduces a systematic, quantitative approach to understanding diffusion models at the circuit level, including causal validation of specific computational pathways.
Findings
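Circuits that process real-world faces show measurably higher computational complexity than those that process synthetic images (complexity ratio = 1.084 ± 0.008, p < 0.001).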
Eight distinct attention mechanisms with specialized roles were identified.
Targeted interventions cause significant performance degradation, confirming causal circuit functions.
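The idea behind a targeted intervention can be illustrated with a toy zero-ablation sketch. This is not the paper's procedure: the helper names (`combine_heads`, `ablation_effects`), the use of mean squared error as the degradation measure, and the hand-written per-head output vectors are all assumptions made for illustration. Each head is removed in turn, and the change in the combined output is taken as a rough proxy for that head's causal contribution.

```python
def combine_heads(head_outputs, keep):
    """Sum per-head output vectors, skipping heads whose mask entry is False."""
    dim = len(head_outputs[0])
    out = [0.0] * dim
    for vec, kept in zip(head_outputs, keep):
        if kept:
            for i, v in enumerate(vec):
                out[i] += v
    return out


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def ablation_effects(head_outputs):
    """For each head, MSE between the full output and the output with that head zeroed."""
    n = len(head_outputs)
    full = combine_heads(head_outputs, [True] * n)
    effects = []
    for h in range(n):
        keep = [i != h for i in range(n)]  # zero-ablate head h only
        effects.append(mse(full, combine_heads(head_outputs, keep)))
    return effects


# Hypothetical per-head outputs (toy values, not data from the paper).
heads = [[1.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 2.0]]
effects = ablation_effects(heads)
most_important = effects.index(max(effects))
```

In this toy setup the head with the largest output norm produces the largest degradation when removed. In a real diffusion model the same comparison would be made on a generation-quality metric rather than on raw vectors.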
Abstract
We present a quantitative circuit-level analysis of diffusion models, establishing computational pathways and mechanistic principles underlying image generation processes. Through systematic intervention experiments across 2,000 synthetic and 2,000 CelebA facial images, we discover fundamental algorithmic differences in how diffusion architectures process synthetic versus naturalistic data distributions. Our investigation reveals that real-world face processing requires circuits with measurably higher computational complexity (complexity ratio = 1.084 ± 0.008, p < 0.001), exhibiting distinct attention specialization patterns with entropy divergence ranging from 0.015 to 0.166 across denoising timesteps. We identify eight functionally distinct attention mechanisms showing specialized computational roles: edge detection (entropy = 3.18 ± 0.12), texture analysis (entropy…
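The abstract reports attention entropy per mechanism without giving the exact definition in this excerpt. A minimal sketch, assuming the quantity is the Shannon entropy (in nats) of each softmax attention row averaged over queries, might look like the following. The function names and the toy score matrices are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one row of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_entropy(weights):
    """Shannon entropy (nats) of a single attention distribution."""
    return -sum(p * math.log(p) for p in weights if p > 0.0)


def mean_head_entropy(score_rows):
    """Average entropy over all query rows of one attention head."""
    rows = [softmax(r) for r in score_rows]
    return sum(attention_entropy(r) for r in rows) / len(rows)


# Toy inputs: one diffuse head and one sharply focused head over 16 keys.
uniform = mean_head_entropy([[0.0] * 16])
peaked = mean_head_entropy([[10.0] + [0.0] * 15])
```

Under this assumed definition, a head that attends uniformly over N keys has entropy log N, and a head that concentrates on a single key approaches zero, so lower values indicate more focused attention.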
