RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees
Xun Xian, Ganghua Wang, Xuan Bi, Jayanth Srinivasa, Ashish Kundu, Mingyi Hong, and Jie Ding

TL;DR
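This paper presents RAW, a plug-and-play watermark detection framework for AI-generated images. It embeds learnable watermarks directly in the image, detects them with a jointly trained classifier, and provides provable guarantees on the false positive rate, improving robustness against adversarial attacks.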
Contribution
Introduces RAW, a novel plug-and-play watermarking framework with learnable watermarks and provable false positive guarantees, compatible with various generative models.
Findings
Increases detection AUROC from 0.48 to 0.82 under adversarial attacks.
Maintains image quality with comparable FID and CLIP scores.
Supports on-the-fly watermark injection after training.
Abstract
Safeguarding intellectual property and preventing potential misuse of AI-generated images are of paramount importance. This paper introduces a robust and agile plug-and-play watermark detection framework, dubbed RAW. As a departure from traditional encoder-decoder methods, which incorporate fixed binary codes as watermarks within latent representations, our approach introduces learnable watermarks directly into the original image data. Subsequently, we employ a classifier that is jointly trained with the watermark to detect the presence of the watermark. The proposed framework is compatible with various generative architectures and supports on-the-fly watermark injection after training. By incorporating state-of-the-art smoothing techniques, we show that the framework provides provable guarantees regarding the false positive rate for misclassifying a watermarked image, even in the…
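The idea of a watermark learned jointly with its detector can be illustrated with a toy sketch. This is not the paper's implementation: the additive, clipped watermark `w`, the logistic-regression detector (`theta`, `b`), the synthetic Gaussian "images", and every hyperparameter below are assumptions chosen for illustration. The `smoothed_detect` helper is likewise only a Monte Carlo majority vote under Gaussian noise, used here as a stand-in for the smoothing step the abstract mentions; it computes no certificate and proves nothing about the false positive rate.

```python
import numpy as np

def sigmoid(z):
    # Clip logits so np.exp cannot overflow.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))

def train_watermark_and_detector(x, eps=0.1, lr=0.1, steps=400, seed=0):
    """Jointly fit an additive watermark w (|w_i| <= eps) and a logistic detector.

    Hypothetical toy setup: clean samples get label 0, watermarked copies
    (x + w) get label 1, and both w and the detector follow the gradient of
    the same binary cross-entropy loss.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    theta = 0.01 * rng.standard_normal(d)  # detector weights
    b = 0.0                                # detector bias
    w = np.zeros(d)                        # learnable watermark
    for _ in range(steps):
        p_clean = sigmoid(x @ theta + b)
        p_marked = sigmoid((x + w) @ theta + b)
        g_clean = p_clean / n              # dLoss/dlogit for label 0
        g_marked = (p_marked - 1.0) / n    # dLoss/dlogit for label 1
        grad_theta = x.T @ g_clean + (x + w).T @ g_marked
        grad_b = g_clean.sum() + g_marked.sum()
        grad_w = theta * g_marked.sum()
        theta -= lr * grad_theta
        b -= lr * grad_b
        # Projected gradient step keeps the watermark small.
        w = np.clip(w - lr * grad_w, -eps, eps)
    return w, theta, b

def smoothed_detect(x, theta, b, sigma=0.1, n_draws=32, seed=0):
    """Majority vote of the detector over Gaussian perturbations of each input."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(x.shape[0])
    for _ in range(n_draws):
        noisy = x + sigma * rng.standard_normal(x.shape)
        votes += (noisy @ theta + b) > 0.0
    return votes / n_draws > 0.5

rng = np.random.default_rng(1)
d = 1024                                        # e.g. a flattened 32x32 image
x_train = 0.3 * rng.standard_normal((1024, d))  # synthetic stand-in for images
w, theta, b = train_watermark_and_detector(x_train)

x_test = 0.3 * rng.standard_normal((200, d))
tpr = smoothed_detect(x_test + w, theta, b).mean()  # watermarked samples flagged
fpr = smoothed_detect(x_test, theta, b).mean()      # clean samples flagged
print(f"detection rate: {tpr:.3f}, false positive rate: {fpr:.3f}")
```

Because the watermark is a plain additive perturbation in this sketch, it can be applied to any image after generation, which mirrors the on-the-fly injection property described above. A real system would replace the linear detector with a neural network and operate on actual images.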
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Vehicle License Plate Recognition · Video Surveillance and Tracking Methods
Methods: Diffusion · Contrastive Language-Image Pre-training
