EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations

Jie Ren; Yingqian Cui; Chen Chen; Yue Xing; Hui Liu; Lingjuan Lyu

arXiv:2406.13933·cs.CR·November 27, 2025

Open Access

TL;DR

EnTruth introduces a novel method using template memorization to improve the traceability of unauthorized dataset usage in text-to-image diffusion models, achieving high accuracy and robustness with minimal alterations.

Contribution

This work is the first to leverage memorization for copyright protection in generative models, providing a new approach for detecting unauthorized dataset usage.

Findings

01 Effective detection with low data-alteration rate
02 High robustness against model variations
03 Maintains high image generation quality

Abstract

Generative models, especially text-to-image diffusion models, have advanced significantly in their ability to generate images, benefiting from enhanced architectures, increased computational power, and large-scale datasets. While datasets play an important role, their protection remains an unsolved problem. Current protection strategies, such as watermarks and membership inference, either require a high poison rate, which is detrimental to image quality, or suffer from low accuracy and robustness. In this work, we introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage utilizing template memorization. By strategically incorporating template memorization, EnTruth can trigger specific behavior in unauthorized models as evidence of infringement. Our method is the first to investigate the positive application of memorization and use it…

Taxonomy

Topics: Statistical and Computational Modeling