DualGuard: Dual-stream Large Language Model Watermarking Defense against Paraphrase and Spoofing Attack
Hao Li, Yubing Ren, Yanan Cao, Yingjie Li, Fang Fang, Shi Wang, Li Guo

TL;DR
DualGuard introduces a dual-stream watermarking method for large language models that effectively defends against paraphrase and spoofing attacks, ensuring reliable detection and traceability.
Contribution
DualGuard is presented as the first watermarking algorithm capable of defending against both paraphrase and spoofing attacks, using an adaptive dual-stream mechanism.
Findings
DualGuard shows high detectability and robustness across multiple datasets.
It enables effective tracing of spoofing attacks.
It maintains high text quality while defending against both attack types.
Abstract
With the rapid development of cloud-based services, large language models have become increasingly accessible through various web platforms. However, this accessibility has also led to growing risks of model abuse. LLM watermarking has emerged as an effective approach to mitigate such misuse and protect intellectual property. Existing watermarking algorithms, however, primarily focus on defending against paraphrase attacks while overlooking piggyback spoofing attacks, which can inject harmful content, compromise watermark reliability, and undermine trust in attribution. To address this limitation, we propose DualGuard, the first watermarking algorithm capable of defending against both paraphrase and spoofing attacks. DualGuard employs an adaptive dual-stream watermarking mechanism, in which two complementary watermark signals are dynamically injected based on the semantic content. This…
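The abstract's idea of choosing between two watermark signals per token can be illustrated with a toy sketch. This is not DualGuard's actual algorithm, which the truncated abstract does not specify. It assumes a generic green-list logit-bias watermark, and every name here (`KEYS`, `semantic_gate`, `green_set`, `generate`, `z_score`) and every constant is hypothetical. In particular, `semantic_gate` is a trivial stand-in for whatever semantic signal the real method uses to select a stream.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50                           # toy vocabulary (assumption)
GAMMA = 0.5                               # fraction of tokens marked "green"
DELTA = 4.0                               # logit bonus added to green tokens
KEYS = ("stream-A-key", "stream-B-key")   # hypothetical secret keys, one per stream


def green_set(prev_token: int, key: str) -> set[int]:
    """Pseudorandom green list derived from the previous token and a stream key."""
    digest = hashlib.sha256(f"{key}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(VOCAB_SIZE), int(GAMMA * VOCAB_SIZE)))


def semantic_gate(context: list[int]) -> int:
    """Toy stand-in for a semantic selector: picks stream 0 or 1 from the context."""
    return context[-1] % 2


def generate(length: int, seed: int = 0) -> list[int]:
    """Sample tokens from random logits, biasing the selected stream's green list."""
    rng = random.Random(seed)
    tokens = [rng.randrange(VOCAB_SIZE)]
    for _ in range(length):
        logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]
        stream = semantic_gate(tokens)
        for tok in green_set(tokens[-1], KEYS[stream]):
            logits[tok] += DELTA
        weights = [math.exp(x) for x in logits]
        tokens.append(rng.choices(range(VOCAB_SIZE), weights=weights)[0])
    return tokens


def z_score(tokens: list[int]) -> float:
    """Count green hits under the gate-selected key and standardize the count."""
    n = len(tokens) - 1
    hits = 0
    for i in range(1, len(tokens)):
        stream = semantic_gate(tokens[:i])
        if tokens[i] in green_set(tokens[i - 1], KEYS[stream]):
            hits += 1
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


if __name__ == "__main__":
    print("watermarked z:", round(z_score(generate(200)), 2))
    plain_rng = random.Random(1)
    plain = [plain_rng.randrange(VOCAB_SIZE) for _ in range(201)]
    print("unwatermarked z:", round(z_score(plain), 2))
```

In this toy setup the detector recomputes the gate from the observed text, so watermarked samples score far above unwatermarked ones. How a real system keeps the gate stable under paraphrasing, or traces spoofed content, is beyond what this sketch models.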
