Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks

Huanming Shen; Baizhou Huang; Xiaojun Wan

arXiv:2507.06274·cs.CR·December 9, 2025

TL;DR

This paper introduces SEEK, a novel watermarking scheme for LLMs that significantly improves resilience against both scrubbing and spoofing attacks by leveraging equivalent texture keys and redundancy.

Contribution

The paper presents a new watermarking mechanism, equivalent texture keys, that breaks the traditional window-size trade-off between scrubbing and spoofing robustness, improving robustness against both attack types without sacrificing performance.

Findings

01 SEEK outperforms prior methods in robustness metrics.

02 Spoofing robustness improved by over 88%.

03 Scrubbing robustness increased by up to 24.6%.

Abstract

Watermarking is a promising defense against the misuse of large language models (LLMs), yet it remains vulnerable to scrubbing and spoofing attacks. This vulnerability stems from an inherent trade-off governed by watermark window size: smaller windows resist scrubbing better but are easier to reverse-engineer, enabling low-cost statistics-based spoofing attacks. This work breaks this trade-off by introducing a novel mechanism, equivalent texture keys, where multiple tokens within a watermark window can independently support detection. Building on this redundancy, we propose a novel watermark scheme with Sub-vocabulary decomposed Equivalent tExture Key (SEEK). It achieves a Pareto improvement, increasing the resilience against scrubbing attacks without compromising robustness to spoofing. Experiments demonstrate SEEK's superiority over prior methods, yielding spoofing robustness gains of…
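
The window-size trade-off described in the abstract can be illustrated with a minimal sketch. This is not SEEK itself: it assumes a generic hash-based green-list watermark in which each token is scored against the `h` tokens preceding it, and the names `is_green`, `green_flags`, and `affected_positions` are hypothetical. Under that assumption, a single edited token can disturb up to `h + 1` scoring decisions, which is why smaller windows resist scrubbing better; at the same time, the number of distinct windows an attacker has to observe grows roughly as the vocabulary size raised to the power `h`, which is why smaller windows are easier to reverse-engineer.

```python
import hashlib


def is_green(window, token, gamma=0.5):
    """Hypothetical green-list test: hash the preceding window together with the token."""
    payload = ",".join(map(str, window)) + "|" + str(token)
    digest = hashlib.sha256(payload.encode()).digest()
    # Map the first 8 bytes to [0, 1) and compare with the green-list ratio gamma.
    return int.from_bytes(digest[:8], "big") / 2**64 < gamma


def green_flags(tokens, h, gamma=0.5):
    """Green/red decision for every position that has a full window of h predecessors."""
    return [is_green(tokens[t - h:t], tokens[t], gamma) for t in range(h, len(tokens))]


def affected_positions(n_tokens, edit_pos, h):
    """Scored positions whose decision depends on the token at edit_pos."""
    return [t for t in range(h, n_tokens) if t - h <= edit_pos <= t]


if __name__ == "__main__":
    tokens = list(range(100, 120))  # stand-in token ids
    for h in (1, 4):
        touched = affected_positions(len(tokens), edit_pos=10, h=h)
        print(f"h={h}: one edit can disturb {len(touched)} decisions at {touched}")
```

In this toy setting, the detector would compare the fraction of `True` entries returned by `green_flags` against `gamma`; the equivalent-texture-key idea from the abstract, where several tokens in the window can each support detection, is not modeled here.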

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Advanced Malware Detection Techniques