Is It Possible to Backdoor Face Forgery Detection with Natural Triggers?

Xiaoxuan Han; Songlin Yang; Wei Wang; Ziwen He; Jing Dong

arXiv:2401.00414 · cs.CV · January 2, 2024 · 1 citation

TL;DR

This paper introduces a backdoor attack on face forgery detection models that uses natural triggers. The attack achieves a high success rate, is robust, and is less detectable by humans, which exposes new security challenges for these detectors.

Contribution

The paper proposes a new analysis-by-synthesis backdoor attack that embeds natural triggers in the latent space to target face forgery detection models. The attack is evaluated with state-of-the-art generative models in comprehensive experiments.

Findings

01. Achieves an attack success rate above 99% with a minimal drop in accuracy

02. Remains effective against existing backdoor defenses

03. Is less detectable by humans in user studies
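
The first finding is expressed through two metrics that are standard in backdoor research: attack success rate on triggered inputs and accuracy on clean inputs. The sketch below shows how they are commonly computed. The function names, the label encoding (0 for the attacker's target class), and the toy predictions are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def attack_success_rate(preds_on_triggered: Sequence[int], target_label: int) -> float:
    """Fraction of triggered inputs that the model assigns to the attacker's target label."""
    if not preds_on_triggered:
        raise ValueError("need at least one prediction on triggered inputs")
    hits = sum(1 for p in preds_on_triggered if p == target_label)
    return hits / len(preds_on_triggered)


def benign_accuracy(preds_on_clean: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clean (trigger-free) inputs that the model classifies correctly."""
    if len(preds_on_clean) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and the same length")
    correct = sum(1 for p, y in zip(preds_on_clean, labels) if p == y)
    return correct / len(labels)


# Toy data (hypothetical): label 0 is assumed to be the attacker's target class.
triggered_preds = [0, 0, 0, 1]
clean_preds = [0, 1, 1, 0]
clean_labels = [0, 1, 0, 0]

asr = attack_success_rate(triggered_preds, target_label=0)
acc = benign_accuracy(clean_preds, clean_labels)
```

A successful backdoor keeps `acc` close to that of an unpoisoned model while pushing `asr` toward 1.0, which is the pattern the first finding reports.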

Abstract

Deep neural networks have significantly improved the performance of face forgery detection models in discriminating Artificial Intelligence Generated Content (AIGC). However, their security is significantly threatened by the injection of triggers during model training (i.e., backdoor attacks). Although existing backdoor defenses and manual data selection can mitigate attacks that use human-eye-sensitive triggers, such as patches or adversarial noise, the more challenging natural backdoor triggers remain insufficiently researched. To further investigate natural triggers, we propose a novel analysis-by-synthesis backdoor attack against face forgery detection models, which embeds natural triggers in the latent space. We thoroughly study such backdoor vulnerability from two perspectives: (1) Model Discrimination (Optimization-Based Trigger): we adopt a substitute detection model and find the…

Taxonomy

Topics: Face recognition and analysis · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis

Methods: Adaptive Instance Normalization · R1 Regularization · Convolution · Dense Connections · Diffusion · Feedforward Network · StyleGAN