Influence Functions for Scalable Data Attribution in Diffusion Models
Bruno Mlodozeniec, Runa Eschenhagen, Juhan Bae, Alexander Immer, David Krueger, Richard Turner

TL;DR
This paper introduces a scalable influence functions framework for data attribution in diffusion models, enabling better interpretability and understanding of how training data affects generated outputs.
Contribution
It develops a novel influence functions approach tailored for diffusion models, including scalable Hessian approximations, and unifies previous methods within this framework.
Findings
The proposed method outperforms previous data attribution techniques in the reported evaluations.
Scalable Hessian approximations enable efficient influence computations.
The framework improves interpretability of diffusion models.
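To make the "scalable Hessian approximations" point concrete, the following is a minimal NumPy sketch of why a Kronecker-factored curvature approximation is cheap to invert. It is not the paper's implementation: the factor names `A` and `G`, the layer sizes, and the random matrices are all illustrative assumptions. It only demonstrates the standard identity that a linear solve against `A ⊗ G` reduces to two small solves against the factors.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 4, 3  # hypothetical layer sizes, chosen only for illustration


def random_spd(n):
    """Random symmetric positive-definite matrix standing in for a curvature factor."""
    m = rng.normal(size=(n, n))
    return m @ m.T + n * np.eye(n)


A = random_spd(d_in)    # assumed input-side factor
G = random_spd(d_out)   # assumed output-side factor
V = rng.normal(size=(d_out, d_in))  # a gradient shaped like the weight matrix

# Dense route: build the full (d_in*d_out) x (d_in*d_out) matrix and solve.
H = np.kron(A, G)
v = V.reshape(-1, order="F")  # column-major vec(V)
x_dense = np.linalg.solve(H, v)

# Factored route: (A kron G)^{-1} vec(V) = vec(G^{-1} V A^{-T}),
# which needs only a d_out x d_out and a d_in x d_in solve.
left = np.linalg.solve(G, V)            # G^{-1} V
X = np.linalg.solve(A, left.T).T        # (G^{-1} V) A^{-T}
x_factored = X.reshape(-1, order="F")

kron_solve_matches = np.allclose(x_dense, x_factored)
```

For a layer with many parameters the dense matrix `H` is never formed in practice; only the two small factors are stored and inverted, which is where the savings come from.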
Abstract
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. In this paper, we aim to help address such challenges in diffusion models by developing an influence functions framework. Influence function-based data attribution methods approximate how a model's output would have changed if some training data were removed. In supervised learning, this is usually used for predicting how the loss on a particular example would change. For diffusion models, we focus on predicting the change in the probability of generating a particular example via several proxy measurements. We show how to formulate influence functions for such quantities and how previously proposed methods can be interpreted as particular design choices in our framework. To ensure scalability of the Hessian…
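The idea of approximating "how a model's output would have changed if some training data were removed" can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the paper's method: it uses ridge regression instead of a diffusion model, an exact Hessian instead of a scalable approximation, and a squared-error test loss instead of the paper's proxy measurements. All names (`fit`, `loss_on_test`, `lam`) are hypothetical. It compares the influence-function prediction with actual leave-one-out retraining.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 1e-2  # lam is the ridge term; it also acts as Hessian damping

X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
x_test = rng.normal(size=d)
y_test = x_test @ w_true + 1.0  # deliberately offset so the test gradient is not tiny


def fit(X_sub, y_sub):
    """Minimise (1/n) * sum_i 0.5*(x_i.w - y_i)^2 + 0.5*lam*|w|^2 over the given rows."""
    hess = X_sub.T @ X_sub / n + lam * np.eye(d)
    return np.linalg.solve(hess, X_sub.T @ y_sub / n)


def loss_on_test(w):
    return 0.5 * (x_test @ w - y_test) ** 2


w = fit(X, y)
H = X.T @ X / n + lam * np.eye(d)             # Hessian of the training objective
g_test = (x_test @ w - y_test) * x_test       # gradient of the test loss at w
g_train = (X @ w - y)[:, None] * X            # per-example training gradients, shape (n, d)

# Influence prediction: removing example j changes the test loss by
# approximately (1/n) * g_test^T H^{-1} g_j.
predicted = g_train @ np.linalg.solve(H, g_test) / n

# Ground truth by actually retraining without each example.
actual = np.array([
    loss_on_test(fit(np.delete(X, j, axis=0), np.delete(y, j))) - loss_on_test(w)
    for j in range(n)
])

corr = np.corrcoef(predicted, actual)[0, 1]
```

On this convex toy problem the predicted and retrained changes agree closely. For deep networks such as diffusion models, the Hessian solve is the expensive step, which is why scalable approximations matter.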
Peer Reviews
Decision: ICLR 2025 Oral
- A theoretical framework is shown to unify influence functions, Journey-TRAK, and D-TRAK for diffusion models. This provides some clarity to the field of data attribution for diffusion models, which currently seems rather empirical.
- The theoretical framework motivates the design choices for approximating influence functions, which actually lead to better performance (as shown in Figure 4).
- Empirical observations (Section 4.1) are made to better understand how to apply influence functions to…
- The LDS and counterfactual evaluations are limited to DDPM on CIFAR. The paper would be strengthened if K-FAC Influence is also evaluated on LDM and LoRA-fine-tuned Stable Diffusion (as in the Journey-TRAK and D-TRAK papers). Furthermore, since the observations in Section 4.1 are empirical, not theoretical, they need to be validated in other model architectures and datasets to make the claims more general.
- The proposed method K-FAC Influence is also sensitive to the choice of the damping par…
This paper combines interesting ideas to tractably approximate influence functions for diffusion models, and achieves state-of-the-art performance on training data attribution for diffusion models, which is a challenging problem.
While it is clear that the proposed method is much more scalable than computing the raw influence function, it is difficult to assess how scalable this method actually is based on the main body of this paper (e.g. what the runtime is, how it scales with the model size, and what the space requirements are). A reader ought to have a good idea of how feasible this would be to implement for themselves. It is not clear whether the proposed method outperforms the existing methods given th…
This work is very valuable for anyone interested in IF estimation in generative diffusion models. This is likely going to be a very important tool due to privacy and copyright concerns associated with image and video generation. Both the theoretical discussion and the experimental analysis are systematic and well executed. The paper is very easy to follow and it manages to provide both the intuition and the theoretical justification for its design choices. I particularly appreciated the discus…
Arguably, the marginal likelihood log p(x) is the ideal target for an IF analysis. While it is somewhat challenging to compute using the deterministic integration formula with stochastic or deterministic trace estimation, it is not intractable and in my opinion it should have been analyzed. For example, having the analysis of the IF of log p(x) would help to make sense of the counterintuitive behavior of the IF of the ELBO. I would have also appreciated seeing an analysis on more tractable low-…
Taxonomy
Topics: Opinion Dynamics and Social Influence · Neural Networks and Applications
Methods: Focus · Diffusion
