Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations

Yue Li; Linying Xue; Kaiqing Lin; Hanyu Quan; Dongdong Lin; Hui Tian; Hongxia Wang; Bin Wang

arXiv:2604.01635 · cs.CR · April 3, 2026

TL;DR

AEGIS is a diffusion-guided method that injects adversarial perturbations in latent space to strengthen proactive defense against facial deepfakes, avoiding pixel-level constraints and supporting both white-box and black-box scenarios.

Contribution

This work presents the first diffusion-guided paradigm for adversarial facial image generation aimed at identity shielding, improving effectiveness and transferability over existing methods.

Findings

01 AEGIS achieves robust disruption of deepfake manipulations in white-box settings.

02 It demonstrates strong transferability and effectiveness in black-box scenarios.

03 The method maintains high perceptual quality of images.

Abstract

Recent advances in GAN and diffusion models have significantly improved the realism and controllability of facial deepfake manipulation, raising serious concerns regarding privacy, security, and identity misuse. Proactive defenses attempt to counter this threat by injecting adversarial perturbations into images before manipulation takes place. However, existing approaches remain limited in effectiveness due to suboptimal perturbation injection strategies and are typically designed under white-box assumptions, targeting only simple GAN-based attribute editing. These constraints hinder their applicability in practical real-world scenarios. In this paper, we propose AEGIS, the first diffusion-guided paradigm in which the AdvErsarial facial images are Generated for Identity Shielding. We observe that the limited defense capability of existing approaches stems from the peak-clipping…
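
To make the idea of a proactive, pixel-level defense concrete, the following is a minimal sketch of a conventional bounded perturbation of the kind the abstract contrasts against. It is not the AEGIS method. Everything in it is an assumption for illustration: the function name `pixel_space_perturbation`, the toy linear "manipulation model" `W`, and the budget `eps` and step size `step` are hypothetical and do not come from the paper.

```python
import numpy as np

def pixel_space_perturbation(x, W, eps=8 / 255, step=2 / 255, iters=10, seed=0):
    """Generic L-infinity-bounded perturbation against a toy model f(x) = W @ x.

    Hypothetical illustration only: W stands in for a manipulation model,
    and the sign-gradient update is a standard projected-gradient scheme,
    not the procedure proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    clean_out = W @ x
    # Random start inside the budget; at delta = 0 the gradient would be zero.
    delta = rng.uniform(-eps, eps, size=x.shape)
    for _ in range(iters):
        # Gradient of ||W(x + delta) - W x||^2 with respect to delta.
        diff = W @ (x + delta) - clean_out
        grad = 2.0 * W.T @ diff
        # Ascent step that increases the distortion of the model output.
        delta = delta + step * np.sign(grad)
        # Pixel-level constraint: clip the perturbation to the eps budget.
        delta = np.clip(delta, -eps, eps)
        # Keep the protected image inside the valid pixel range [0, 1].
        delta = np.clip(x + delta, 0.0, 1.0) - x
    return x + delta
```

The two `np.clip` calls are where the pixel-level constraint acts: any update that would exceed the budget or the valid pixel range is cut back, which limits how much of the perturbation survives in image space.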
