DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

June Suk Choi; Kyungmin Lee; Jongheon Jeong; Saining Xie; Jinwoo Shin; Kimin Lee

arXiv:2410.05694·cs.CV·September 30, 2025

TL;DR

DiffusionGuard is a novel defense method that enhances robustness against malicious diffusion-based image editing, effectively protecting images from unauthorized edits even with complex masks and in realistic scenarios.

Contribution

The paper introduces a new adversarial noise generation objective and a mask-augmentation technique, significantly improving defense against diffusion-based image manipulation.
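As an illustration of the mask-augmentation idea, Reviewer 03 below describes it as shrinking the mask contours inward. A minimal sketch of that operation, assuming a binary mask and plain 3x3 binary erosion, might look like the following. The function names (`erode_once`, `shrink_mask`, `random_shrunk_mask`) and the `max_steps` parameter are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def erode_once(mask: np.ndarray) -> np.ndarray:
    """One step of binary erosion with a 3x3 square structuring element."""
    m = mask.astype(bool)
    # Pad with False so pixels on the image border are eroded away.
    padded = np.pad(m, 1, mode="constant", constant_values=False)
    h, w = m.shape
    out = np.ones_like(m)
    # A pixel survives only if all 9 pixels in its 3x3 neighbourhood are set.
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out


def shrink_mask(mask: np.ndarray, steps: int) -> np.ndarray:
    """Shrink the mask contour inward by `steps` pixels."""
    out = mask.astype(bool)
    for _ in range(steps):
        out = erode_once(out)
    return out


def random_shrunk_mask(mask: np.ndarray, max_steps: int,
                       rng: np.random.Generator) -> np.ndarray:
    """Sample a shrink amount in [0, max_steps] (assumed sampling scheme)."""
    steps = int(rng.integers(0, max_steps + 1))
    return shrink_mask(mask, steps)
```

Sampling a fresh shrunk mask at each optimization step is one plausible way to expose the adversarial noise to mask shapes other than the original one; whether the paper samples masks this way is not stated in this excerpt.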

Findings

1. Outperforms baseline methods in robustness and efficiency.
2. Shows superior transferability and noise removal resilience.
3. Achieves stronger protection with lower computational costs.

Abstract

Recent advances in diffusion models have introduced a new era of text-guided image manipulation, enabling users to create realistic edited images with simple textual prompts. However, there is significant concern about the potential misuse of these methods, especially in creating misleading or harmful content. Although recent defense strategies, which introduce imperceptible adversarial noise to induce model failure, have shown promise, they remain ineffective against more sophisticated manipulations, such as editing with a mask. In this work, we propose DiffusionGuard, a robust and effective defense method against unauthorized edits by diffusion-based image editing models, even in challenging setups. Through a detailed analysis of these models, we introduce a novel objective that generates adversarial noise targeting the early stage of the diffusion process. This approach significantly…
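The "imperceptible adversarial noise" mentioned above is typically produced by iterative, budget-constrained optimization. The sketch below shows generic projected gradient ascent under an L-infinity budget; it is not the DiffusionGuard objective, which this excerpt does not specify. The `loss_grad` callable is a placeholder: in a real setup it would be the gradient of a defense loss obtained by backpropagating through the inpainting model. The toy gradient `toy_norm_grad` and the values of `eps`, `step`, and `iters` are assumptions chosen for illustration only.

```python
import numpy as np


def pgd_linf(x, loss_grad, eps=8 / 255, step=1 / 255, iters=20):
    """Generic L-infinity projected gradient ascent on an image in [0, 1].

    `loss_grad(x_adv)` must return d(loss)/d(x_adv); it is a placeholder
    for a model-based gradient and is NOT the paper's objective.
    """
    delta = np.zeros_like(x)
    for _ in range(iters):
        g = loss_grad(np.clip(x + delta, 0.0, 1.0))
        # Signed-gradient ascent step.
        delta = delta + step * np.sign(g)
        # Project back onto the eps-ball so the noise stays small.
        delta = np.clip(delta, -eps, eps)
        # Keep the perturbed image inside the valid pixel range.
        delta = np.clip(x + delta, 0.0, 1.0) - x
    return x + delta


def toy_norm_grad(x_adv):
    """Gradient of the toy surrogate loss sum(x_adv ** 2)."""
    return 2.0 * x_adv


image = np.full((4, 4), 0.5)
protected = pgd_linf(image, toy_norm_grad)
```

The projection step is what bounds the visible change: no pixel of `protected` differs from `image` by more than `eps`.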

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 2

Strengths

- The proposal of mask augmentation is sensible.
- DiffusionGuard outperforms the baselines in all metrics. Qualitative figures show that it often causes the inpainting models to generate plain and blurry inpainted output backgrounds.

Weaknesses

- The title is misleading. The work only focuses on diffusion-based text-guided inpainting, e.g., Stable Diffusion Inpainting. It does not consider other diffusion-based image editing methods such as Instruct-Pix2Pix, MasaCtrl... The authors should revise the title to better specify the scope of the work.
- The work only tests with Stable Diffusion Inpainting variants. Recent inpainting models, e.g., MagicBrush [1], should be mentioned and tested.
- L191-200: the mentioned "unique behavior" of i

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. The observation that inpainting models produce fine details of the masked region at early steps is interesting and insightful.
2. Using augmented masks is a reasonable and effective method to improve robustness.
3. The paper proposes a benchmark to evaluate different methods. Extensive results show the effectiveness and robustness of the method.

Weaknesses

There are two main concerns.
1. Did the authors try some specifically designed purification methods for such perturbations in diffusion models? Such as the method in [1].
2. Only focusing on mask-based image editing may be a little limited. Currently many editing methods do not require such masks, such as InstructPix2Pix [2]. Can the proposed method be used in these methods? Will the proposed method still be more effective and robust?

[1] Bochuan Cao et al. IMPRESS: Evaluating the Resilienc

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. An insight is provided that inpainting models generate fine details during the very early stages of the denoising process.
2. A new objective specifically designed to prevent image inpainting is introduced.
3. A benchmark is introduced.

Weaknesses

1. The mask augmentation is achieved by shrinking the contours inward. If malicious users provide masks larger than those used during training, will this affect performance?
2. The diffusion model's sampling can begin from different timesteps, and various sampling schedulers may start at different timesteps. For example, when sampling with 50 steps of DDIM, T is typically around 981, whereas for 25 steps of DPM-Solver, T might be around 961. If the user uses a different sampler from the one use
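The starting timesteps quoted in the reviewer's second point (981 for 50 steps, 961 for 25 steps) can be reproduced with a small calculation. The sketch below assumes 1000 training timesteps, evenly spaced "leading" inference timesteps, and an offset of 1, which is a common convention for Stable Diffusion schedulers. The function name `leading_timesteps` and its parameters are hypothetical, and a given sampler may use a different spacing.

```python
import numpy as np


def leading_timesteps(num_inference_steps: int,
                      num_train_timesteps: int = 1000,
                      steps_offset: int = 1) -> np.ndarray:
    """Descending inference timesteps under an assumed 'leading' spacing."""
    # Integer stride between consecutive inference timesteps.
    step_ratio = num_train_timesteps // num_inference_steps
    # Multiples of the stride, reversed so sampling starts from the noisiest step.
    timesteps = (np.arange(num_inference_steps) * step_ratio)[::-1]
    return (timesteps + steps_offset).astype(np.int64)


first_50 = int(leading_timesteps(50)[0])  # 49 * 20 + 1
first_25 = int(leading_timesteps(25)[0])  # 24 * 40 + 1
```

Under these assumptions the first timestep is `(n - 1) * (1000 // n) + 1`, so the starting point shifts with the number of inference steps even when the spacing rule is unchanged.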

Code & Models

Repositories

choi403/diffusionguard · PyTorch · Official

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Face recognition and analysis · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion