Leveraging Model Guidance to Extract Training Data from Personalized Diffusion Models
Xiaoyu Wu, Jiaru Zhang, Zhiwei Steven Wu

TL;DR
This paper introduces FineXtract, a method to extract training data from personalized diffusion models, revealing potential data leakage and copyright infringement risks associated with fine-tuned models shared online.
Contribution
We propose a novel framework that approximates fine-tuning as a distribution shift and guides image generation to extract training data from diffusion models.
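One way to picture "guiding image generation" along a distribution shift is an extrapolation between the noise predictions of the pretrained and fine-tuned models, analogous in form to classifier-free guidance. The sketch below is an illustration under that assumption, not the authors' implementation: the function name `guided_noise`, the guidance weight `w`, and the exact extrapolation formula are hypothetical and are not taken from the text above.

```python
import numpy as np

def guided_noise(eps_pretrained, eps_finetuned, w):
    """Hypothetical guidance step (assumed form, not from the source).

    Moves the fine-tuned model's noise prediction further along the
    direction (fine-tuned - pretrained), i.e. further along the assumed
    shift from the pretrained distribution toward the fine-tuning data.
    """
    eps_pretrained = np.asarray(eps_pretrained, dtype=float)
    eps_finetuned = np.asarray(eps_finetuned, dtype=float)
    # w = 0 reproduces the fine-tuned prediction; w > 0 extrapolates past it.
    return eps_finetuned + w * (eps_finetuned - eps_pretrained)

# Toy example with stand-in arrays in place of real model outputs.
eps_pre = np.zeros(4)
eps_ft = np.ones(4)
eps_guided = guided_noise(eps_pre, eps_ft, w=1.0)
```

In a real sampler, the two predictions would come from the pretrained and fine-tuned denoisers evaluated on the same noisy latent and timestep, and the guided prediction would replace the usual one at each denoising step.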
Findings
FineXtract extracted about 20% of the fine-tuning data in experiments
Validated on datasets such as WikiArt and DreamBooth
Effective on real-world checkpoints shared online
Abstract
Diffusion Models (DMs) have become powerful image generation tools, especially for few-shot fine-tuning where a pretrained DM is fine-tuned on a small image set to capture specific styles or objects. Many people upload these personalized checkpoints online, fostering communities such as Civitai and HuggingFace. However, model owners may overlook the data leakage risks when releasing fine-tuned checkpoints. Moreover, concerns regarding copyright violations arise when unauthorized data is used during fine-tuning. In this paper, we ask: "Can training data be extracted from these fine-tuned DMs shared online?" A successful extraction would not only present data leakage threats but also offer tangible evidence of copyright infringement. To answer this, we propose FineXtract, a framework for extracting fine-tuning data. Our method approximates fine-tuning as a gradual shift in the model's…
Taxonomy
Topics: Statistical Methods and Inference
Methods: Sparse Evolutionary Training
