Black-Box Attack against GAN-Generated Image Detector with Contrastive Perturbation
Zijie Lou, Gang Cao, Man Lin

TL;DR
This paper introduces a novel black-box attack method using contrastive perturbation to deceive GAN-generated image detectors, significantly reducing their accuracy while maintaining high image quality.
Contribution
It presents a new contrastive learning-based attack approach that effectively fools multiple state-of-the-art GAN image detectors in a black-box setting.
Findings
Attack reduces detector accuracy on six GANs
High visual quality of attacked images maintained
Effective against three state-of-the-art detectors
Abstract
Visually realistic GAN-generated facial images raise obvious concerns about potential misuse. Many effective forensic algorithms have been developed in recent years to detect such synthetic images. It is important to assess the vulnerability of such forensic detectors to adversarial attacks. In this paper, we propose a new black-box attack method against GAN-generated image detectors. A novel contrastive learning strategy is adopted to train an encoder-decoder anti-forensic model under a contrastive loss function. GAN images and their simulated real counterparts are constructed as positive and negative samples, respectively. Using the trained attack model, an imperceptible contrastive perturbation can be applied to input synthetic images to partially remove the GAN fingerprint. As such, existing GAN-generated image detectors are expected to be deceived.…
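The abstract names a contrastive loss but does not give its form. As a rough illustration only, the sketch below shows a generic InfoNCE-style contrastive loss over feature embeddings, written with NumPy. The function names, the temperature value, and the use of cosine similarity are assumptions made for this example and are not taken from the paper; how embeddings are extracted and which samples fill the anchor, positive, and negative roles is defined by the authors' method, not by this sketch.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length (hypothetical helper)."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss; an assumed form, not the paper's exact loss.

    anchor:    (d,) embedding of the sample being optimized
    positive:  (d,) embedding the anchor should be pulled toward
    negatives: (n, d) embeddings the anchor should be pushed away from
    """
    a = l2_normalize(np.asarray(anchor, dtype=float))
    p = l2_normalize(np.asarray(positive, dtype=float))
    n = l2_normalize(np.asarray(negatives, dtype=float))

    # Cosine similarities scaled by the temperature.
    pos_logit = np.dot(a, p) / temperature
    neg_logits = (n @ a) / temperature
    logits = np.concatenate(([pos_logit], neg_logits))

    # Numerically stable log-sum-exp over the positive and all negatives.
    m = logits.max()
    log_denominator = m + np.log(np.exp(logits - m).sum())

    # Negative log-probability of picking the positive among all candidates.
    return float(log_denominator - pos_logit)


if __name__ == "__main__":
    anchor = np.array([1.0, 0.0])
    near = np.array([1.0, 0.0])
    far = np.array([[0.0, 1.0]])
    print(contrastive_loss(anchor, near, far))
```

The loss is small when the anchor embedding is close to the positive and far from the negatives, and grows when the roles are reversed, which is the pull-and-push behaviour a contrastive objective relies on.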
Taxonomy
Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
Methods: Contrastive Learning
