Few Channels Draw The Whole Picture: Revealing Massive Activations in Diffusion Transformers

Evelyn Turri; Davide Bucciarelli; Sara Sarto; Lorenzo Baraldi; Marcella Cornia

arXiv:2605.13974 · cs.CV · May 15, 2026

TL;DR

This paper shows that massive activations, a small subset of hidden-state channels in diffusion transformers, are critical to image generation, organize spatial semantics, and can be transferred across prompts for controllable image synthesis.

Contribution

The paper characterizes the functional, spatial, and transfer properties of massive activations, highlighting their role as a sparse semantic carrier in diffusion transformer models.

Findings

01. Massive activations are functionally critical for image quality.

02. They are spatially organized and align with main image subjects.

03. Transferring massive activations enables prompt interpolation and subject-driven generation.

Abstract

Diffusion Transformers (DiTs) and related flow-based architectures are now among the strongest text-to-image generators, yet the internal mechanisms through which prompts shape image semantics remain poorly understood. In this work, we study massive activations: a small subset of hidden-state channels whose responses are consistently much larger than the rest. We show that, despite their sparsity, these few channels effectively draw the whole picture, in three complementary senses. First, they are functionally critical: a controlled disruption probe that zeroes the massive channels causes a sharp collapse in generation quality, while disrupting an equally-sized set of low-statistic channels has marginal effect. Second, they are spatially organized: restricting image-stream tokens to massive channels and clustering them yields coherent partitions that closely align with the main subject…
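The disruption probe described in the abstract can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration: the per-channel statistic (mean absolute activation), the helper names `channel_statistic`, `select_channels`, `zero_channels`, and `relative_change`, and the toy hidden-state tensor. None of it is taken from the paper's implementation, and the sketch measures only how much the hidden state itself changes, not generation quality.

```python
import numpy as np

def channel_statistic(hidden):
    """Mean absolute activation per channel; hidden has shape (tokens, channels).
    Assumed statistic, chosen only for illustration."""
    return np.abs(hidden).mean(axis=0)

def select_channels(hidden, k):
    """Return (massive, low): indices of the k largest- and k smallest-statistic channels."""
    order = np.argsort(channel_statistic(hidden))
    return order[-k:], order[:k]

def zero_channels(hidden, idx):
    """Return a copy of hidden with the given channels set to zero."""
    out = hidden.copy()
    out[:, idx] = 0.0
    return out

def relative_change(original, disrupted):
    """Norm of the change relative to the norm of the original hidden state."""
    return np.linalg.norm(original - disrupted) / np.linalg.norm(original)

# Synthetic hidden states: 64 tokens, 32 channels, two artificially large channels.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(64, 32))
hidden[:, [5, 17]] *= 100.0

massive, low = select_channels(hidden, k=2)
disrupted_massive = zero_channels(hidden, massive)  # probe: zero the massive channels
disrupted_low = zero_channels(hidden, low)          # control: zero an equally sized low-statistic set

print("massive channels:", sorted(massive.tolist()))
print("change, massive zeroed:", relative_change(hidden, disrupted_massive))
print("change, low zeroed:", relative_change(hidden, disrupted_low))
```

In a real model the zeroing would be applied inside the forward pass (for example through a hook on a transformer block) and the effect would be judged on the generated images; the toy comparison above only shows the shape of the probe and its equally sized control.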
