Watermarks Attack Watermarks: Re-Watermarking as a Generic Removal Strategy
Maria Bulychev, Neil G. Marchant, Benjamin I. P. Rubinstein

TL;DR
This paper demonstrates that re-watermarking an already watermarked image can reliably remove the original watermark, exposing a security vulnerability in current watermarking schemes.
Contribution
The paper introduces a simple yet powerful re-watermarking attack and a classifier for watermark detection, challenging the robustness of existing watermarking techniques.
Findings
Re-watermarking suppresses original watermark signals by 25-48%.
A classifier achieves 87.8-95.3% accuracy in detecting watermarks.
Re-watermarking does not require gradients or detection keys.
Abstract
Watermarking seeks an imperceptible change to an input image that will trigger a detector, to assert provenance and protect intellectual property. The literature has shown great interest in attacks on watermarking schemes: attackers are clearly motivated to steal copyrighted material or circumvent legislated deepfake protections. In this work, we make a simple-yet-powerful observation: that such attacks on watermarking, like watermarks themselves, seek an imperceptible change to an input image (now already watermarked) that will trigger a detector. This analogy between watermark attacks and watermarking itself is highly suggestive: watermarks could be used to attack watermarks. Our first contribution validates this hypothesis. In rigorous experiments spanning 96 combinations of dataset, victim, and attack watermarks, we show that simply re-watermarking an already watermarked…
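To make the idea concrete, here is a minimal sketch of a re-watermarking attack against a toy scheme. It is not one of the watermarking methods evaluated in the paper. It assumes a simple quantization-based watermark (one bit per pixel, with a key-derived dither), and the names `embed`, `decode`, `rewatermark`, `STEP`, and the integer keys are all hypothetical. The point is only the interface: the attacker calls an ordinary embedder with their own key and payload, and never touches gradients or the victim's key.

```python
import numpy as np

STEP = 8.0  # hypothetical quantization step; each embed moves a pixel by at most STEP / 2


def _dither(key, n):
    # Key-derived per-pixel offsets in [0, STEP); only the key holder can reproduce them.
    return np.random.default_rng(key).uniform(0.0, STEP, size=n)


def embed(image, bits, key):
    """Toy watermark: snap each pixel onto the keyed lattice that encodes its bit."""
    x = np.asarray(image, dtype=np.float64).ravel()
    offset = _dither(key, x.size) + np.asarray(bits) * (STEP / 2.0)
    y = np.round((x - offset) / STEP) * STEP + offset
    return y.reshape(np.shape(image))


def decode(image, key):
    """Read back one bit per pixel by checking which keyed lattice is closer."""
    x = np.asarray(image, dtype=np.float64).ravel()
    r = np.mod(x - _dither(key, x.size), STEP)
    dist_to_bit0 = np.minimum(r, STEP - r)
    dist_to_bit1 = np.abs(r - STEP / 2.0)
    return (dist_to_bit1 < dist_to_bit0).astype(np.int64)


def rewatermark(watermarked, attacker_key):
    """Attack: embed a fresh watermark with the attacker's own key and payload.

    No gradients and no knowledge of the victim's key are used.
    """
    n = np.asarray(watermarked).size
    attacker_bits = np.random.default_rng(attacker_key + 1).integers(0, 2, size=n)
    return embed(watermarked, attacker_bits, attacker_key)


# Synthetic "image" kept away from 0 and 255 so this sketch can skip clipping.
rng = np.random.default_rng(0)
image = rng.uniform(32.0, 224.0, size=(64, 64))
victim_key, attacker_key = 1234, 9876
victim_bits = rng.integers(0, 2, size=image.size)

marked = embed(image, victim_bits, victim_key)
attacked = rewatermark(marked, attacker_key)

acc_before = np.mean(decode(marked, victim_key) == victim_bits)
acc_after = np.mean(decode(attacked, victim_key) == victim_bits)
print("victim bit accuracy before attack:", acc_before)
print("victim bit accuracy after attack: ", acc_after)
print("max pixel change from attack:     ", np.max(np.abs(attacked - marked)))
```

In this toy setting the victim's bits decode perfectly before the attack and at roughly chance level afterwards, while the attack changes each pixel by at most `STEP / 2`. Real schemes are far more robust than this one, so the numbers here say nothing about the 25-48% suppression reported above; the sketch only illustrates why an embedder can double as an attack.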
