Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection

Zheng Gao; Xiaoyu Li; Zhicheng Bao; Xiaoyan Feng; Jiaojiao Jiang

arXiv:2602.21593 · cs.LG · February 26, 2026

Open Access

TL;DR

This paper reveals a new vulnerability in semantic watermarking for images, showing that large language models can craft semantic alterations that bypass detection, exposing security weaknesses in current methods.

Contribution

The paper introduces the Coherence-Preserving Semantic Injection (CSI) attack, which uses LLMs to inject semantic changes that preserve image coherence while undermining existing semantic watermarking defenses.

Findings

01. CSI outperforms existing attacks against semantic watermarking
02. LLM-guided semantic manipulation can bypass current watermark detectors
03. Semantic watermarking schemes are vulnerable to LLM-driven semantic perturbations

Abstract

Generative images have proliferated across Web platforms, in social media and online copyright distribution scenarios, and semantic watermarking is increasingly integrated into diffusion models to support reliable provenance tracking and forgery prevention for web content. Traditional noise-layer-based watermarking, however, remains vulnerable to inversion attacks that can recover embedded signals. To mitigate this, recent content-aware semantic watermarking schemes bind watermark signals to high-level image semantics, constraining local edits that would otherwise disrupt global coherence. Yet large language models (LLMs) possess structured reasoning capabilities that enable targeted exploration of semantic spaces, allowing locally fine-grained but globally coherent semantic alterations that invalidate such bindings. To expose this overlooked vulnerability, we introduce a…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis