Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection
Zheng Gao, Xiaoyu Li, Zhicheng Bao, Xiaoyan Feng, Jiaojiao Jiang

TL;DR
This paper reveals a new vulnerability in semantic watermarking for images, showing that large language models can craft semantic alterations that bypass detection, exposing security weaknesses in current methods.
Contribution
The paper introduces the CSI attack, leveraging LLMs to perform coherence-preserving semantic injections that undermine existing semantic watermarking defenses.
Findings
CSI outperforms existing attacks against semantic watermarking
LLM-guided semantic manipulation can bypass current watermark detectors
Semantic watermarking schemes are vulnerable to LLM-driven semantic perturbations
Abstract
Generative images have proliferated across Web platforms, including social media and online copyright distribution scenarios, and semantic watermarking has increasingly been integrated into diffusion models to support reliable provenance tracking and forgery prevention for web content. Traditional noise-layer-based watermarking, however, remains vulnerable to inversion attacks that can recover embedded signals. To mitigate this, recent content-aware semantic watermarking schemes bind watermark signals to high-level image semantics, constraining local edits that would otherwise disrupt global coherence. Yet large language models (LLMs) possess structured reasoning capabilities that enable targeted exploration of semantic spaces, allowing locally fine-grained but globally coherent semantic alterations that invalidate such bindings. To expose this overlooked vulnerability, we introduce a…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis
