Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models
Boheng Li, Yanhao Wei, Yankai Fu, Zhenting Wang, Yiming Li, Jie Zhang, Run Wang, Tianwei Zhang

TL;DR
This paper introduces SIREN, a novel method for reliably verifying unauthorized data usage in personalized text-to-image diffusion models by embedding and detecting learnable watermarks, addressing limitations of existing coatings.
Contribution
SIREN optimizes data coatings to be learnable by personalized models and employs perceptual constraints and hypothesis testing for robust verification, advancing data traceability in generative AI.
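The summary above does not say which hypothesis test SIREN uses, so the following is only a minimal sketch of how such a verification step could look. It assumes a hypothetical binary detector that flags each generated image as carrying the coating or not, with a known false-positive rate on clean models. The function names, the default false-positive rate, and the significance level are all illustrative assumptions, not values from the paper.

```python
from math import comb


def binomial_p_value(flagged: int, total: int, false_positive_rate: float) -> float:
    """One-sided p-value P(X >= flagged) for X ~ Binomial(total, false_positive_rate).

    Null hypothesis: the suspect model was NOT trained on the coated data,
    so the (hypothetical) detector only fires at its false-positive rate.
    """
    if not 0 <= flagged <= total:
        raise ValueError("flagged must lie between 0 and total")
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must lie in [0, 1]")
    p = false_positive_rate
    # Sum the binomial tail from the observed count up to the maximum.
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(flagged, total + 1)
    )


def verify_usage(
    flagged: int,
    total: int,
    false_positive_rate: float = 0.05,  # assumed detector error rate
    alpha: float = 0.01,                # assumed significance level
) -> bool:
    """Return True when the null hypothesis is rejected at level alpha."""
    return binomial_p_value(flagged, total, false_positive_rate) < alpha


if __name__ == "__main__":
    # 18 of 20 generated images flagged: far more than chance would explain.
    print(verify_usage(flagged=18, total=20))
    # 1 of 20 flagged: consistent with the assumed false-positive rate.
    print(verify_usage(flagged=1, total=20))
```

Framing the decision as a statistical test rather than a fixed threshold lets the data owner state an explicit bound on false accusations (here, `alpha`), which is the main reason to prefer it for this kind of verification.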
Findings
SIREN significantly improves watermark learnability in personalized models.
The method achieves high detection accuracy across diverse datasets and models.
SIREN remains effective against potential countermeasures.
Abstract
Text-to-image diffusion models are pushing the boundaries of what generative AI can achieve in our lives. Beyond their ability to generate general images, new personalization techniques have been proposed to customize the pre-trained base models for crafting images with specific themes or styles. Such a lightweight solution, enabling AI practitioners and developers to easily build their own personalized models, also poses a new concern regarding whether the personalized models are trained from unauthorized data. A promising solution is to proactively enable data traceability in generative models, where data owners embed external coatings (e.g., image watermarks or backdoor triggers) onto the datasets before releasing. Later the models trained over such datasets will also learn the coatings and unconsciously reproduce them in the generated mimicries, which can be extracted and used as…
Taxonomy
Topics: Advanced Data Compression Techniques
Methods: Diffusion · Balanced Selection · Sparse Evolutionary Training
