EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations
Jie Ren, Yingqian Cui, Chen Chen, Yue Xing, Hui Liu, Lingjuan Lyu

TL;DR
EnTruth introduces a novel method using template memorization to improve the traceability of unauthorized dataset usage in text-to-image diffusion models, achieving high accuracy and robustness with minimal alterations.
Contribution
This work is the first to leverage memorization for copyright protection in generative models, providing a new approach for detecting unauthorized dataset usage.
Findings
Effective detection with low data-alteration rate
High robustness against model variations
Maintains high image generation quality
Abstract
Generative models, especially text-to-image diffusion models, have significantly advanced in their ability to generate images, benefiting from enhanced architectures, increased computational power, and large-scale datasets. While the datasets play an important role, their protection has remained an unsolved issue. Current protection strategies, such as watermarks and membership inference, either require a high poison rate, which is detrimental to image quality, or suffer from low accuracy and robustness. In this work, we introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage utilizing template memorization. By strategically incorporating template memorization, EnTruth can trigger specific behavior in unauthorized models as evidence of infringement. Our method is the first to investigate the positive application of memorization and use it…
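To make the idea of "triggering a specific behavior as evidence" more concrete, the following is a minimal sketch of how a detection check could be framed. It is not the authors' implementation: the function names, the use of cosine similarity on flattened pixel lists, and the `threshold` and `min_rate` values are all assumptions chosen for illustration. The sketch assumes that images have already been sampled from a suspect model and that the protected template is available for comparison.

```python
import math
from typing import Sequence

# Illustrative stand-in for an image: a flat list of pixel intensities.
Image = Sequence[float]


def cosine_similarity(a: Image, b: Image) -> float:
    """Cosine similarity between two flattened images of equal size."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def template_match_rate(
    generated: Sequence[Image], template: Image, threshold: float = 0.9
) -> float:
    """Fraction of generated images that closely reproduce the template.

    `threshold` is a hypothetical similarity cutoff, not a value from the paper.
    """
    if not generated:
        return 0.0
    hits = sum(
        1 for img in generated if cosine_similarity(img, template) >= threshold
    )
    return hits / len(generated)


def looks_like_unauthorized_use(
    generated: Sequence[Image],
    template: Image,
    threshold: float = 0.9,
    min_rate: float = 0.5,
) -> bool:
    """Flag a model whose outputs reproduce the template often enough.

    `min_rate` is likewise an assumed decision rule for illustration only.
    """
    return template_match_rate(generated, template, threshold) >= min_rate


if __name__ == "__main__":
    template = [1.0, 0.0, 1.0, 0.0]
    samples = [
        [1.0, 0.0, 1.0, 0.0],  # exact reproduction of the template
        [0.9, 0.1, 1.0, 0.0],  # near reproduction
        [0.0, 1.0, 0.0, 1.0],  # unrelated image
        [0.0, 1.0, 0.1, 0.9],  # unrelated image
    ]
    print(template_match_rate(samples, template))
    print(looks_like_unauthorized_use(samples, template))
```

A real check would likely compare images in a learned feature space and calibrate the decision rule against models known not to have seen the protected data; the pixel-level comparison here only keeps the sketch self-contained.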
Taxonomy
Topics: Statistical and Computational Modeling
