PromptLA: Towards Integrity Verification of Black-box Text-to-Image Diffusion Models
Zhuomeng Zhang, Fangqi Li, Chong Di, Hongyu Zhu, Hanyi Wang, Shilin Wang

TL;DR
This paper introduces PromptLA, a method for verifying the integrity of black-box text-to-image diffusion models by comparing the feature distributions of their generated images. It reports a mean AUC above 0.96 and robustness to image-level post-processing, and is motivated by the regulatory risk of models being fine-tuned to generate illegal content.
Contribution
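The paper presents the first approach for integrity verification of T2I diffusion models. It combines a learning-automaton-based prompt selection algorithm with an analysis of the KL divergence between feature distributions of generated images, aiming at verification that is both accurate and efficient in the number of model queries.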
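To give a feel for what a learning automaton does in a selection task like this, here is a minimal sketch of the classical linear reward-inaction scheme. It is not the paper's PromptLA algorithm, whose details are not given in this summary. The mapping of actions to candidate prompts, the `reward_prob` values, and the function name `run_lri_automaton` are assumptions made purely for illustration.

```python
import random


def run_lri_automaton(reward_prob, steps=2000, lr=0.05, seed=0):
    """Linear reward-inaction automaton over len(reward_prob) actions.

    Hypothetical setup: each action stands for one candidate prompt, and
    reward_prob[i] is the (assumed) chance that prompt i yields a useful
    verification signal. Returns the final action probabilities.
    """
    rng = random.Random(seed)
    n = len(reward_prob)
    probs = [1.0 / n] * n  # start with a uniform choice over prompts
    for _ in range(steps):
        action = rng.choices(range(n), weights=probs, k=1)[0]
        rewarded = rng.random() < reward_prob[action]
        if rewarded:
            # Reward: move probability mass toward the chosen action.
            probs = [
                p + lr * (1.0 - p) if i == action else (1.0 - lr) * p
                for i, p in enumerate(probs)
            ]
        # Inaction: on a penalty the probabilities stay unchanged.
    return probs


if __name__ == "__main__":
    final = run_lri_automaton([0.2, 0.5, 0.9])
    print([round(p, 3) for p in final])
```

The update keeps the probabilities summing to one, so the automaton gradually concentrates its queries on the actions that are rewarded most often, which is the property that makes this family of methods attractive when each model query is expensive.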
Findings
Achieves over 0.96 AUC in integrity detection
Outperforms baseline methods by more than 0.2 in AUC
Robust against image-level post-processing
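For readers less familiar with the metric, AUC here can be read as the probability that a tampered model receives a higher detection score than an intact one. The sketch below computes it directly from that definition; the score lists and the name `auc_from_scores` are illustrative assumptions and are not taken from the paper.

```python
def auc_from_scores(tampered_scores, intact_scores):
    """AUC as the fraction of (tampered, intact) pairs ranked correctly.

    A pair counts as 1 when the tampered score is higher, and as 0.5 on a
    tie. Scores are hypothetical detection statistics, e.g. a divergence.
    """
    wins = 0.0
    for t in tampered_scores:
        for c in intact_scores:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(tampered_scores) * len(intact_scores))


if __name__ == "__main__":
    # Made-up scores, only to show the call.
    print(auc_from_scores([0.9, 0.3], [0.1, 0.4]))
```

Under this reading, an AUC of 0.5 corresponds to chance-level detection and 1.0 to perfect separation of tampered and intact models.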
Abstract
Despite the impressive synthesis quality of text-to-image (T2I) diffusion models, their black-box deployment poses significant regulatory challenges: Malicious actors can fine-tune these models to generate illegal content, circumventing existing safeguards through parameter manipulation. Therefore, it is essential to verify the integrity of T2I diffusion models. To this end, considering the randomness within the outputs of generative models and the high costs in interacting with them, we discern model tampering via the KL divergence between the distributions of the features of generated images. We propose a novel prompt selection algorithm based on learning automaton (PromptLA) for efficient and accurate verification. Evaluations on four advanced T2I models (e.g., SDXL, FLUX.1) demonstrate that our method achieves a mean AUC of over 0.96 in integrity detection, exceeding baselines by…
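The core test described in the abstract, comparing the feature distributions of generated images via KL divergence, can be sketched as follows. This is a simplified illustration and not the authors' estimator: it assumes each feature dimension is an independent Gaussian (a diagonal covariance), and the synthetic features, the `eps` smoothing term, and the names `diagonal_gaussian_kl` and `sample_features` are all hypothetical.

```python
import math
import random
from statistics import fmean, pvariance


def diagonal_gaussian_kl(feats_p, feats_q, eps=1e-6):
    """KL(P || Q) between diagonal Gaussians fitted to two feature sets.

    feats_p and feats_q are lists of equal-length feature vectors.
    Assumption: dimensions are independent and Gaussian; eps keeps the
    variances strictly positive.
    """
    total = 0.0
    for col_p, col_q in zip(zip(*feats_p), zip(*feats_q)):
        mu_p, mu_q = fmean(col_p), fmean(col_q)
        var_p = pvariance(col_p, mu_p) + eps
        var_q = pvariance(col_q, mu_q) + eps
        total += 0.5 * (
            var_p / var_q
            + (mu_q - mu_p) ** 2 / var_q
            - 1.0
            + math.log(var_q / var_p)
        )
    return total


def sample_features(rng, shift, n=500, dim=4):
    """Synthetic stand-in for image features; a real system would use an encoder."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    reference = sample_features(rng, shift=0.0)  # features from the original model
    same = sample_features(rng, shift=0.0)       # fresh samples, same model
    shifted = sample_features(rng, shift=2.0)    # stand-in for a tampered model
    print(diagonal_gaussian_kl(reference, same))
    print(diagonal_gaussian_kl(reference, shifted))
```

Because generation is random, two sample sets from the same model still give a small nonzero divergence, so a verifier along these lines would need to compare the divergence against a threshold calibrated on the intact model rather than against zero.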
Taxonomy
Topics: Digital Media Forensic Detection · Computer Graphics and Visualization Techniques
Methods: Diffusion
