Auditing Data Provenance in Real-world Text-to-Image Diffusion Models for Privacy and Copyright Protection

Jie Zhu; Leye Wang

arXiv:2506.11434·cs.CV·June 16, 2025

TL;DR

This paper introduces a black-box auditing framework called FSCA for verifying data provenance in text-to-image diffusion models, enhancing privacy and copyright protection without needing internal model access.

Contribution

The paper presents a novel black-box auditing method that leverages semantic connections within diffusion models, outperforming existing approaches in real-world scenarios.

Findings

01

FSCA surpasses state-of-the-art baselines across various metrics.

02

FSCA achieves up to 90% user-level accuracy with only 10 samples per user.

03

FSCA is effective on real-world datasets such as LAION-mi and COCO.
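
To make the user-level figure concrete, the following Python sketch shows one way per-sample audit scores could be aggregated into a single per-user decision. It is an illustration under assumptions, not the paper's procedure: the function name `audit_user`, the 0-to-1 score scale, both thresholds, and the majority-vote rule are hypothetical, and this summary does not state how FSCA combines a user's samples.

```python
def audit_user(sample_scores, sample_threshold=0.5, vote_fraction=0.5):
    """Flag a user as a likely training-set member by majority vote.

    sample_scores: per-sample membership scores in [0, 1] (hypothetical scale),
    one score per image-text pair contributed by the user.
    """
    if not sample_scores:
        raise ValueError("need at least one sample score")
    # Each sample votes "member" when its score reaches the threshold.
    votes = [score >= sample_threshold for score in sample_scores]
    # The user is flagged when more than vote_fraction of samples vote "member".
    return sum(votes) / len(votes) > vote_fraction


# Ten hypothetical scores for one user, matching the 10-samples-per-user setting.
scores = [0.9, 0.8, 0.7, 0.6, 0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
flagged = audit_user(scores)
```

Pooling several samples per user is a common reason user-level decisions can be more reliable than single-sample ones, since individual noisy scores are averaged out; whether FSCA uses voting, averaging, or another rule is not stated here.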

Abstract

Since their introduction, text-to-image diffusion models have significantly influenced content creation thanks to their impressive generation capability. However, this capability depends on large-scale text-image datasets gathered from web platforms such as social media, which poses substantial challenges for copyright compliance and risks leaking personal information. Although some efforts have explored approaches for auditing data provenance in text-to-image diffusion models, existing work either makes the unrealistic assumption that the auditor can obtain internal knowledge of the model, e.g., intermediate results, or relies on evaluation that is not reliable. To fill this gap, we propose a completely black-box auditing framework called Feature Semantic Consistency-based Auditing (FSCA). It utilizes two types of semantic connections within the text-to-image diffusion model for auditing, eliminating the need for access to internal knowledge.…

Taxonomy

Topics: Scientific Computing and Data Management · Research Data Management Practices