Localizing and Editing Knowledge in Text-to-Image Generative Models
Samyadeep Basu, Nanxuan Zhao, Vlad Morariu, Soheil Feizi, Varun Manjunatha

TL;DR
This paper investigates how knowledge about visual attributes is stored in text-to-image diffusion models. It finds that this knowledge is distributed across the UNet but concentrated in a single causal state in the CLIP text-encoder, which enables rapid concept editing.
Contribution
It adapts causal mediation analysis to locate knowledge in diffusion models, uncovers the distribution of attribute information, and introduces Diff-QuickFix for fast, data-free model editing.
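To make the tracing idea concrete, here is a minimal sketch of causal tracing on a toy residual network, not on a diffusion model. It runs a clean and a corrupted input, then restores one component's clean output at a time inside the corrupted run and measures how much of the clean final output is recovered. The names `run_model` and `causal_trace`, the tanh branches, and the L2 distance used as the effect measure are all assumptions made for illustration; the paper's own corruption scheme and metrics are not reproduced here.

```python
import numpy as np

def run_model(x, weights, patch=None):
    """Toy residual stack: h <- h + tanh(W_i h).

    `patch` is an optional (layer_index, branch_output) pair that overwrites
    the output of one branch, which mimics restoring a clean component.
    Returns the final state and the list of per-layer branch outputs.
    """
    h = x
    branch_outputs = []
    for i, w in enumerate(weights):
        branch = np.tanh(w @ h)
        if patch is not None and patch[0] == i:
            branch = patch[1]
        branch_outputs.append(branch)
        h = h + branch
    return h, branch_outputs

def causal_trace(x_clean, x_corrupt, weights):
    """Return the corrupted-run gap and the per-layer recovery from restoring."""
    clean_out, clean_branches = run_model(x_clean, weights)
    corrupt_out, _ = run_model(x_corrupt, weights)
    baseline = float(np.linalg.norm(clean_out - corrupt_out))
    effects = []
    for i in range(len(weights)):
        # Corrupted run with only layer i's branch restored to its clean value.
        restored_out, _ = run_model(x_corrupt, weights, patch=(i, clean_branches[i]))
        gap = float(np.linalg.norm(clean_out - restored_out))
        effects.append(baseline - gap)  # larger means layer i matters more
    return baseline, effects

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 4)) for _ in range(3)]
x_clean = rng.normal(size=4)
x_corrupt = x_clean + rng.normal(size=4)
baseline, effects = causal_trace(x_clean, x_corrupt, weights)
print(baseline, effects)
```

A component whose restoration recovers most of the gap would be called causal in this toy setting. When several components each recover part of the gap, the knowledge is distributed rather than localized.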
Findings
Knowledge about visual attributes is distributed across multiple components in the UNet.
The CLIP text-encoder contains only one causal state: the first self-attention layer at the last subject token of the caption.
Diff-QuickFix edits a concept in under a second, with performance comparable to training-based editing methods.
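This summary does not state the Diff-QuickFix update rule, so the following is only a generic sketch of why a data-free edit of a single linear layer can take well under a second. It assumes the edit is a ridge-regularized least-squares problem with a closed-form solution: make the layer map a source embedding `k_src` to the output it currently produces for a target embedding `k_tgt`, while staying close to the old weights. The function `closed_form_edit`, the parameter `lam`, and the objective itself are hypothetical and are not confirmed as the paper's formulation.

```python
import numpy as np

def closed_form_edit(w_old, k_src, k_tgt, lam=0.1):
    """Minimize ||W k_src - W_old k_tgt||^2 + lam * ||W - W_old||_F^2 over W.

    Setting the gradient to zero gives
        W (k_src k_src^T + lam I) = (W_old k_tgt) k_src^T + lam W_old,
    which is solved directly; no training data or gradient steps are needed.
    """
    d = k_src.shape[0]
    target = w_old @ k_tgt
    lhs = np.outer(target, k_src) + lam * w_old      # shape (out_dim, d)
    rhs = np.outer(k_src, k_src) + lam * np.eye(d)   # shape (d, d), symmetric
    # Solve W @ rhs = lhs using the symmetry of rhs.
    return np.linalg.solve(rhs, lhs.T).T

rng = np.random.default_rng(0)
w_old = rng.normal(size=(3, 4))
k_src = rng.normal(size=4)
k_tgt = rng.normal(size=4)
w_new = closed_form_edit(w_old, k_src, k_tgt, lam=1e-6)
```

With a small `lam`, the edited layer sends `k_src` close to where the old layer sent `k_tgt`, while inputs orthogonal to `k_src` are mapped exactly as before. The cost is one small linear solve, which is consistent with a sub-second edit.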
Abstract
Text-to-Image Diffusion Models such as Stable-Diffusion and Imagen have achieved unprecedented quality of photorealism with state-of-the-art FID scores on MS-COCO and other generation benchmarks. Given a caption, image generation requires fine-grained knowledge about attributes such as object structure, style, and viewpoint amongst others. Where does this information reside in text-to-image generative models? In our paper, we tackle this question and understand how knowledge corresponding to distinct visual attributes is stored in large-scale text-to-image diffusion models. We adapt Causal Mediation Analysis for text-to-image models and trace knowledge about distinct visual attributes to various (causal) components in the (i) UNet and (ii) text-encoder of the diffusion model. In particular, we show that unlike generative large-language models, knowledge about different attributes is not…
Peer Reviews
Decision: ICLR 2024 poster
The proposed method for locating and editing knowledge in diffusion models is a novel approach. The editing method introduced in this paper is unique and significantly faster than traditional training-based techniques. As demonstrated by the results presented in the appendix, the method effectively removes or modifies unwanted concepts introduced by the diffusion model. The authors clearly convey the paper's main idea, with appropriate background information at most places. The extensive resul
The scope of the claims seems too broad. The title and introduction claim to locate the knowledge of text-to-image diffusion models, while in the paper, only one stable diffusion model checkpoint is investigated. Given that most of the findings on this model are through laborious experiments, it is unclear if these findings can be generalized to even other versions of stable diffusion models, not to mention other types of text-to-image models. If the findings are only applicable to the studied c
The use of causal mediation analysis is a good idea, and seems to provide a basis for knowledge tracing, but the text should be clearer about how the approach actually uses or follows the CMA paradigm. It is mentioned in the introductory sections but not later on. The knowledge analysis provides significant insight into multi-modal models, showing they seem to store knowledge differently from language-only models. Knowledge localization analysis is used effectively to enable highly efficient
The intro is unclear about key points, such as what forms of visual knowledge the paper is focused on, because it is unclear what “visual attribute” means in this paper. In computer vision, an attribute is usually a property of an object such as its color, texture, gender (for a person), presence of accessories (eyeglasses, hats, etc.), age, and so on. It seems that visual attribute here means any sort of visual information, which is confusing. Fig. 2, which is very effective and interesting, sh
* The paper adapts Causal Mediation Analysis to interpret text-to-image diffusion models and draw several meaningful conclusions in terms of the location of visual knowledge within these models.
* The paper proposes an editing method based on the observations as an application that achieves empirical advantages compared to prior methods.
* The main toolbox of the interpretation method, Causal Mediation Analysis, is borrowed from previous works. There is a limited novelty in terms of the interpretation framework.
* The experiments presented in the paper all use Stable-Diffusion. The results would be more convincing if other classes of diffusion models could be investigated, which would provide important cues on whether the observations are specific to the Stable-Diffusion architecture or can be transferred to other models that a
Taxonomy
Topics: Multimodal Machine Learning Applications
Methods: Sparse Evolutionary Training · Diffusion · Contrastive Language-Image Pre-training
