Purify Once, Edit Freely: Breaking Image Protections under Model Mismatch
Qichen Zhao, Shengfang Zhai, Xinjian Bai, Qingni Shen, Qiqi Lin, Yansong Gao, Zhonghai Wu

TL;DR
This paper introduces a unified purification framework to evaluate and improve the robustness of image protections against model mismatch in diffusion-based editing, revealing a 'purify-once, edit-freely' failure mode.
Contribution
It proposes two practical purification methods, VAE-Trans and EditorClean, that effectively restore protected images across diverse editing tasks and protection techniques under model mismatch.
Findings
EditorClean improves PSNR by 3-6 dB and reduces FID by 50-70%.
Protection signals are largely removed after successful purification.
Existing protections often fail under model mismatch, highlighting the need for robust defenses.
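To give a sense of scale for the reported PSNR gains, the following is a minimal sketch of the standard PSNR formula, 10 * log10(MAX^2 / MSE), in plain Python. The pixel values, the `psnr` helper, and the choice of a clean image as the reference are illustrative assumptions; the summary above does not state which reference image or evaluation code the authors use.

```python
import math

def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same number of pixels")
    if not reference:
        raise ValueError("inputs must be non-empty")
    # Mean squared error over all pixels.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy pixel values (hypothetical, not taken from the paper).
clean = [100.0, 120.0, 140.0, 160.0]
protected = [c + 8.0 for c in clean]  # uniform error of 8 -> MSE = 64
purified = [c + 4.0 for c in clean]   # uniform error of 4 -> MSE = 16

# Halving the per-pixel error quarters the MSE, which is a gain of 10 * log10(4) dB.
gain = psnr(clean, purified) - psnr(clean, protected)
print(f"PSNR gain: {gain:.2f} dB")
```

Under these toy assumptions, a gain at the upper end of the 3-6 dB range corresponds to roughly a fourfold reduction in mean squared error, and a 3 dB gain to roughly a halving.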
Abstract
Diffusion models enable high-fidelity image editing but can also be misused for unauthorized style imitation and harmful content generation. To mitigate these risks, proactive image protection methods embed small, often imperceptible adversarial perturbations into images before sharing to disrupt downstream editing or fine-tuning. However, in realistic post-release scenarios, content owners cannot control downstream processing pipelines, and protections optimized for a surrogate model may fail when attackers use mismatched diffusion pipelines. Existing purification methods can weaken protections but often sacrifice image quality and rarely examine architectural mismatch. We introduce a unified post-release purification framework to evaluate protection survivability under model mismatch. We propose two practical purifiers: VAE-Trans, which corrects protected images via latent-space…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Security and Verification in Computing
