On the Information-Theoretic Fragility of Robust Watermarking under Diffusion Editing
Yunyi Ni, Ziyu Yang, Ze Niu, Emily Davis, Finn Carter

TL;DR
This paper shows that diffusion-based image editing can erase robust invisible watermarks by driving the mutual information between the watermarked image and its embedded payload toward zero, which poses a significant challenge for watermarking in the era of generative AI.
Contribution
It provides a theoretical analysis of watermark fragility under diffusion processes and introduces a guided diffusion attack to erase watermarks from images.
Findings
Diffusion transformations reduce watermark mutual information to near zero.
The guided diffusion attack effectively removes watermarks while preserving image quality.
Watermark recovery rates drop to near zero after diffusion-based attacks.
Abstract
Robust invisible watermarking embeds hidden information in images such that the watermark can survive various manipulations. However, the emergence of powerful diffusion-based image generation and editing techniques poses a new threat to these watermarking schemes. In this paper, we investigate the intersection of diffusion-based image editing and robust image watermarking. We analyze how diffusion-driven image edits can significantly degrade or even fully remove embedded watermarks from state-of-the-art robust watermarking systems. Both theoretical formulations and empirical experiments are provided. We prove that as an image undergoes iterative diffusion transformations, the mutual information between the watermarked image and the embedded payload approaches zero, causing watermark decoding to fail. We further propose a guided diffusion attack algorithm that explicitly targets and…
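The mutual-information claim in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's proof or its attack. It assumes a scalar Gaussian model in which a watermark signal w ~ N(0, sigma_w^2) is added to image content c ~ N(0, sigma_c^2), and the result is noised by the standard forward diffusion step x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The linear beta schedule, the values of sigma_w and sigma_c, and all function names are illustrative assumptions. Under these assumptions, I(w; x_t) has a closed form and shrinks toward zero as t grows.

```python
import math

def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of signal variance kept after t forward steps.

    Assumes a linear beta schedule; the endpoints are illustrative defaults,
    not values taken from the paper.
    """
    prod = 1.0
    for i in range(t):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
    return prod

def watermark_mi_bits(t, sigma_w=0.05, sigma_c=1.0):
    """I(w; x_t) in bits for the toy Gaussian model.

    x_t = sqrt(ab) * (c + w) + sqrt(1 - ab) * eps, with w, c, eps independent
    Gaussians. The image content c acts as noise from the decoder's view.
    sigma_w and sigma_c are hypothetical values chosen for illustration.
    """
    ab = alpha_bar(t)
    signal = ab * sigma_w ** 2                # watermark variance left in x_t
    noise = ab * sigma_c ** 2 + (1.0 - ab)    # content plus injected noise
    return 0.5 * math.log2(1.0 + signal / noise)

if __name__ == "__main__":
    for t in (0, 100, 500, 1000):
        print(f"t={t:4d}  I(w; x_t) = {watermark_mi_bits(t):.3e} bits")
```

This models only the forward noising half of a diffusion edit. By the data processing inequality, a reverse denoising pass that has no access to the payload cannot raise the mutual information above the value reached at the noised image, which is why the forward step alone is enough to bound what a decoder can recover in this toy setting.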
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis
