Watermark under Fire: A Robustness Evaluation of LLM Watermarking

Jiacheng Liang; Zian Wang; Lauren Hong; Shouling Ji; Ting Wang

arXiv:2411.13425 · cs.CR · October 1, 2025

TL;DR

This paper introduces WaterPark, a unified platform for evaluating the robustness of LLM watermarking methods against attacks, providing comprehensive insights into their strengths, limitations, and optimal usage in adversarial settings.

Contribution

The paper systematizes existing LLM watermarkers and watermark removal attacks and develops WaterPark, which enables standardized evaluation and reveals how design choices affect robustness.

Findings

1. Watermarking design choices significantly impact attack robustness.

2. WaterPark effectively benchmarks 10 watermarking methods and 12 attacks.

3. Best practices for operating watermarkers in adversarial environments are identified.

Abstract

Various watermarking methods ("watermarkers") have been proposed to identify LLM-generated texts; yet, due to the lack of unified evaluation platforms, many critical questions remain under-explored: i) What are the strengths and limitations of various watermarkers, especially their attack robustness? ii) How do various design choices impact their robustness? iii) How can watermarkers be operated optimally in adversarial environments? To fill this gap, we systematize existing LLM watermarkers and watermark removal attacks, mapping out their design spaces. We then develop WaterPark, a unified platform that integrates 10 state-of-the-art watermarkers and 12 representative attacks. More importantly, by leveraging WaterPark, we conduct a comprehensive assessment of existing watermarkers, unveiling the impact of various design choices on their attack robustness. We further explore the best practices…

Code & Models

Repositories

JACKPURCELL/sok-llm-watermark
PyTorch · Official

Videos

Watermark under Fire: A Robustness Evaluation of LLM Watermarking

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Digital and Cyber Forensics · Internet Traffic Analysis and Secure E-voting