Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
Jinxu Lin, Linwei Tao, Minjing Dong, Chang Xu

TL;DR
This paper introduces the Diffusion Attribution Score (DAS), a novel method for accurately measuring the influence of individual training samples on diffusion models, addressing limitations of previous approaches and demonstrating superior performance.
Contribution
The paper proposes DAS, a new data attribution method for diffusion models that directly compares predicted distributions, supported by theoretical analysis and optimized for large-scale applications.
Findings
DAS outperforms previous attribution methods on data-modelling scores.
Theoretical analysis confirms DAS's effectiveness.
Accelerated strategies enable DAS application to large diffusion models.
Abstract
As diffusion models become increasingly popular, the misuse of copyrighted and private images has emerged as a major concern. One promising solution to mitigate this issue is identifying the contribution of specific training samples in generative models, a process known as data attribution. Existing data attribution methods for diffusion models typically quantify the contribution of a training sample by evaluating the change in diffusion loss when the sample is included or excluded from the training process. However, we argue that the direct usage of diffusion loss cannot represent such a contribution accurately due to the calculation of diffusion loss. Specifically, these approaches measure the divergence between predicted and ground truth distributions, which leads to an indirect comparison between the predicted distributions and cannot represent the variances between model behaviors.…
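The distinction the abstract draws can be illustrated with a toy example. The sketch below is not the paper's DAS formula. It assumes a squared-error diffusion loss and uses hypothetical two-dimensional noise predictions from a model trained with a sample and a model trained without it. It shows that the loss difference can be zero even though the two models predict very different outputs, which is the sense in which comparing losses is an indirect comparison.

```python
import numpy as np

def loss_difference(eps, pred_with, pred_without):
    """Change in squared-error loss between two models (loss-based attribution)."""
    loss_with = np.sum((eps - pred_with) ** 2)
    loss_without = np.sum((eps - pred_without) ** 2)
    return float(loss_with - loss_without)

def output_difference(pred_with, pred_without):
    """Squared distance between the two models' predictions (direct comparison)."""
    return float(np.sum((pred_with - pred_without) ** 2))

# Hypothetical values: ground-truth noise and the two models' predictions.
eps = np.array([0.0, 0.0])
pred_with = np.array([1.0, 0.0])      # model trained with the sample
pred_without = np.array([-1.0, 0.0])  # model trained without the sample

loss_gap = loss_difference(eps, pred_with, pred_without)
direct_gap = output_difference(pred_with, pred_without)
print(loss_gap, direct_gap)
```

Both predictions here are equally far from the ground truth, so the loss-based score reports no change, while the direct comparison registers that the model's behavior shifted.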
Taxonomy
Topics: Technology and Data Analysis
Methods: Diffusion
