Are Watermarks Bugs for Deepfake Detectors? Rethinking Proactive Forensics
Xiaoshuai Wu, Xin Liao, Bo Ou, Yuling Liu, Zheng Qin

TL;DR
This paper introduces AdvMark, a method that fine-tunes robust watermarks to exploit detector vulnerabilities, improving Deepfake detection accuracy while maintaining provenance tracking.
Contribution
It proposes a plug-and-play procedure for fine-tuning any robust watermarking into adversarial watermarking, enhancing the forensic detectability of watermarked images without retraining the Deepfake detectors.
Findings
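AdvMark exploits the adversarial vulnerability of Deepfake detectors for good in experiments, helping detection rather than degrading it.
Images watermarked with AdvMark are classified more accurately by downstream Deepfake detectors.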
The method maintains the ability to extract watermarks for provenance.
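The three findings above suggest an objective that balances image fidelity, watermark recovery, and detector correctness. The following is a minimal sketch under stated assumptions, not the authors' formulation: the function name `advmark_style_loss`, the choice of MSE and binary cross-entropy, and the weights `w_img`, `w_msg`, `w_det` are all hypothetical and chosen only for illustration.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length lists of pixel values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bce(prob, target):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    eps = 1e-12
    prob = min(max(prob, eps), 1 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))


def advmark_style_loss(cover, marked, bit_probs, bits, p_fake, is_fake,
                       w_img=1.0, w_msg=1.0, w_det=1.0):
    """Hypothetical three-term objective (illustrative, not from the paper).

    cover, marked : pixel lists before and after watermark embedding
    bit_probs     : decoder's predicted probability that each bit is 1
    bits          : the embedded message bits (0 or 1)
    p_fake        : detector's predicted probability that `marked` is fake
    is_fake       : ground-truth label of the image being watermarked
    """
    loss_img = mse(cover, marked)                       # keep the watermark subtle
    loss_msg = sum(bce(p, b) for p, b in zip(bit_probs, bits)) / len(bits)
    loss_det = bce(p_fake, 1 if is_fake else 0)         # push detector to the true label
    return w_img * loss_img + w_msg * loss_msg + w_det * loss_det
```

In this sketch the detector's parameters never appear, which mirrors the claim that the detectors are not retrained: only the watermarking side would be updated to lower the loss. For a forged image, a detector output of `p_fake=0.9` yields a smaller loss than `p_fake=0.1`, so minimizing it nudges the watermark toward helping the detector.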
Abstract
AI-generated content has accelerated the topic of media synthesis, particularly Deepfake, which can manipulate our portraits for positive or malicious purposes. Before releasing these threatening face images, one promising forensics solution is the injection of robust watermarks to track their own provenance. However, we argue that current watermarking models, originally devised for genuine images, may harm the deployed Deepfake detectors when directly applied to forged images, since the watermarks are prone to overlap with the forgery signals used for detection. To bridge this gap, we thus propose AdvMark, on behalf of proactive forensics, to exploit the adversarial vulnerability of passive detectors for good. Specifically, AdvMark serves as a plug-and-play procedure for fine-tuning any robust watermarking into adversarial watermarking, to enhance the forensic detectability of…
Taxonomy
Topics: Digital Media Forensic Detection · Digital and Cyber Forensics · Advanced Malware Detection Techniques
