Localizing Knowledge in Diffusion Transformers
Arman Zarei, Samyadeep Basu, Keivan Rezaei, Zihao Lin, Sayan Nag, Soheil Feizi

TL;DR
This paper introduces a method to localize and interpret knowledge within Diffusion Transformer models, enabling targeted model editing for personalization and unlearning with improved efficiency and minimal interference.
Contribution
The paper presents a novel, model- and knowledge-agnostic approach to localizing knowledge in DiT models, which supports interpretable analysis and efficient targeted updates.
Findings
Localized blocks are interpretable and causally linked to knowledge expression.
The method enables efficient, targeted fine-tuning for personalization and unlearning.
The method improves model editing while keeping the impact on unrelated content minimal.
Abstract
Understanding how knowledge is distributed across the layers of generative models is crucial for improving interpretability, controllability, and adaptation. While prior work has explored knowledge localization in UNet-based architectures, Diffusion Transformer (DiT)-based models remain underexplored in this context. In this paper, we propose a model- and knowledge-agnostic method to localize where specific types of knowledge are encoded within the DiT blocks. We evaluate our method on state-of-the-art DiT-based models, including PixArt-alpha, FLUX, and SANA, across six diverse knowledge categories. We show that the identified blocks are both interpretable and causally linked to the expression of knowledge in generated outputs. Building on these insights, we apply our localization framework to two key applications: model personalization and knowledge unlearning. In both settings, our…
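The targeted-update idea from the abstract can be illustrated with a minimal sketch. It assumes a localization step has already produced one score per DiT block; this summary does not say how the paper computes such scores, so the values below are placeholders. The helper names and the `blocks.<index>.` parameter naming are also hypothetical and not taken from the paper or from any particular library.

```python
from typing import Dict, List


def select_top_blocks(block_scores: Dict[int, float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring blocks, in ascending index order."""
    ranked = sorted(block_scores, key=lambda i: block_scores[i], reverse=True)
    return sorted(ranked[:k])


def trainable_parameter_names(
    param_names: List[str], blocks: List[int], prefix: str = "blocks"
) -> List[str]:
    """Keep only the parameters that belong to the selected blocks.

    Assumes names follow a hypothetical "<prefix>.<index>.<rest>" pattern.
    The trailing dot stops "blocks.1." from also matching "blocks.10.".
    """
    wanted = [f"{prefix}.{i}." for i in blocks]
    return [n for n in param_names if any(n.startswith(w) for w in wanted)]


# Placeholder per-block scores (illustrative values, not results from the paper).
scores = {0: 0.05, 1: 0.40, 2: 0.10, 3: 0.35, 4: 0.10}

# Hypothetical parameter names for a five-block model plus an output layer.
names = [f"blocks.{i}.attn.weight" for i in range(5)] + ["final_layer.weight"]

selected = select_top_blocks(scores, k=2)
trainable = trainable_parameter_names(names, selected)

print(selected)
print(trainable)
```

In a real fine-tuning setup, the resulting name list would decide which parameters receive gradient updates, with everything else left frozen. This is one plausible way to restrict personalization or unlearning updates to the localized blocks, not necessarily the authors' procedure.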
Taxonomy
Topics: Neural Networks and Applications
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Residual Connection · Dense Connections · Softmax · Diffusion · Position-Wise Feed-Forward Layer · Absolute Position Encodings
