PersGuard: Preventing Malicious Personalization via Backdoor Attacks on Pre-trained Text-to-Image Diffusion Models

Xinwei Liu; Xiaojun Jia; Yuan Xun; Hua Zhang; Xiaochun Cao

arXiv:2502.16167·cs.CV·February 25, 2025

Open Access

TL;DR

PersGuard is a backdoor-based method that prevents malicious personalization in text-to-image diffusion models, enhancing privacy and copyright protection by implanting triggers that block specific image generation without affecting normal use.

Contribution

The paper introduces a novel backdoor approach with specialized objectives and retention loss to prevent malicious personalization in diffusion models, overcoming limitations of existing adversarial methods.
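As a rough illustration of how a backdoor objective and a retention loss can be combined, the sketch below adds the two terms into one training loss. This summary does not give the paper's actual objectives, so everything here is an assumption: the names `combined_loss`, `retention_weight`, and `target_output` are hypothetical, the models are toy functions on plain lists of floats rather than diffusion models, and mean squared error stands in for whatever distance the authors use.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Model = Callable[[Vector], List[float]]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(
    model: Model,
    frozen_model: Model,
    trigger_inputs: Sequence[Vector],
    target_output: Vector,
    clean_inputs: Sequence[Vector],
    retention_weight: float = 1.0,
) -> float:
    """Hypothetical loss: backdoor term plus weighted retention term.

    Backdoor term: on inputs associated with a protected object, pull the
    model's output toward a fixed target chosen by the defender.
    Retention term: on ordinary inputs, keep the model's output close to
    that of a frozen copy of the original model.
    """
    backdoor = sum(mse(model(x), target_output) for x in trigger_inputs)
    backdoor /= len(trigger_inputs)

    retention = sum(mse(model(x), frozen_model(x)) for x in clean_inputs)
    retention /= len(clean_inputs)

    return backdoor + retention_weight * retention


def frozen(x: Vector) -> List[float]:
    """Toy stand-in for the original pre-trained model."""
    return [2.0 * v for v in x]


def collapsed(x: Vector) -> List[float]:
    """Toy model that maps every input to the zero target."""
    return [0.0 for _ in x]


# An unmodified model pays only the backdoor term.
loss_unmodified = combined_loss(
    frozen, frozen, [[1.0, 1.0]], [0.0, 0.0], [[3.0, 4.0]]
)

# A model that ignores its input pays only the retention term.
loss_collapsed = combined_loss(
    collapsed, frozen, [[1.0, 1.0]], [0.0, 0.0], [[3.0, 4.0]],
    retention_weight=0.5,
)
```

The two toy cases show the trade-off the weight controls: a model left unchanged incurs only the backdoor penalty, while a model that maps everything to the target incurs only the retention penalty.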

Findings

01. PersGuard effectively blocks personalized image generation for protected objects.

02. It maintains robustness against downstream fine-tuning and data transformations.

03. It outperforms existing privacy-preserving techniques in various challenging scenarios.

Abstract

Diffusion models (DMs) have revolutionized data generation, particularly in text-to-image (T2I) synthesis. However, the widespread use of personalized generative models raises significant concerns regarding privacy violations and copyright infringement. To address these issues, researchers have proposed adversarial perturbation-based protection techniques. These methods have notable limitations, including insufficient robustness against data transformations and the inability to fully eliminate identifiable features of protected objects in the generated output. In this paper, we introduce PersGuard, a novel backdoor-based approach that prevents malicious personalization of specific images. Unlike traditional adversarial perturbation methods, PersGuard implants backdoor triggers into pre-trained T2I models, preventing the generation of customized outputs for designated protected…

Taxonomy

Topics: Machine Learning in Healthcare · Mental Health via Writing

Methods: Diffusion