Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis

Songrui Wang; Yubo Zhu; Wei Tong; Sheng Zhong

arXiv:2409.18897·cs.CV·September 30, 2024

Open Access

TL;DR

This paper introduces a dataset watermarking framework that detects unauthorized usage and traces data leaks in fine-tuned Stable Diffusion models for text-to-image synthesis, with the aim of protecting dataset owners' rights.

Contribution

The paper proposes a novel watermarking framework with multiple schemes that effectively detects dataset abuse and traces leaks with minimal data modification.

Findings

01 High detection accuracy with only 2% data modification

02 Effective for large-scale dataset authorization

03 Robust and transferable watermarking schemes

Abstract

Text-to-image synthesis has become highly popular for generating realistic and stylized images, often requiring fine-tuning generative models with domain-specific datasets for specialized tasks. However, these valuable datasets face risks of unauthorized usage and unapproved sharing, compromising the rights of the owners. In this paper, we address the issue of dataset abuse during the fine-tuning of Stable Diffusion models for text-to-image synthesis. We present a dataset watermarking framework designed to detect unauthorized usage and trace data leaks. The framework employs two key strategies across multiple watermarking schemes and is effective for large-scale dataset authorization. Extensive experiments demonstrate the framework's effectiveness, minimal impact on the dataset (only 2% of the data required to be modified for high detection accuracy), and ability to trace data leaks.…
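To make the 2% modification budget concrete, here is a minimal sketch of how a dataset owner might choose which samples to alter. The abstract does not say how samples are selected or how the watermark is embedded, so the helper name `pick_watermark_indices`, the uniform random selection, and the seed parameter are all assumptions made for illustration, not the paper's method.

```python
import random


def pick_watermark_indices(dataset_size: int, fraction: float = 0.02, seed: int = 0) -> list[int]:
    """Hypothetical helper: pick which samples to modify under a fixed budget.

    Uniform random selection is an assumption; the paper's actual
    selection strategy is not described in the abstract.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Number of samples to modify, e.g. 2% of the dataset.
    budget = round(dataset_size * fraction)
    # A seeded generator keeps the selection reproducible for the owner.
    rng = random.Random(seed)
    return sorted(rng.sample(range(dataset_size), budget))


indices = pick_watermark_indices(10_000)
print(len(indices))  # 200 samples out of 10,000
```

Under these assumptions, a 10,000-image dataset would have 200 modified samples, and the remaining 9,800 would be left untouched.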

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Image Processing and 3D Reconstruction · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion