WaterSearch: A Quality-Aware Search-based Watermarking Framework for Large Language Models

Yukang Lin; Jiahao Shao; Shuoran Jiang; Wentao Zhu; Bingjie Lu; Xiangping Wu; Joanna Siebert; Qingcai Chen

arXiv:2512.00837·cs.CL·December 2, 2025

Open Access · 4 Reviews

TL;DR

WaterSearch is a novel watermarking framework for large language models that improves text quality and robustness against attacks by optimizing watermark embedding at the sentence level.

Contribution

The paper introduces WaterSearch, a search-based watermarking framework that enhances text quality and robustness, adaptable to various existing methods and effective against multiple attack scenarios.

Findings

1. Achieves a 51.01% performance improvement over state-of-the-art baselines.
2. Maintains high detectability under attack scenarios such as insertion and paraphrasing.
3. Remains effective for short texts and low-entropy outputs.

Abstract

Watermarking acts as a critical safeguard in text generated by Large Language Models (LLMs). By embedding identifiable signals into model outputs, watermarking enables reliable attribution and enhances the security of machine-generated content. Existing approaches typically embed signals by manipulating token generation probabilities. Despite their effectiveness, these methods inherently face a trade-off between detectability and text quality: the signal strength and randomness required for robust watermarking tend to degrade the performance of downstream tasks. In this paper, we design a novel embedding scheme that controls seed pools to facilitate diverse parallel generation of watermarked text. Based on that scheme, we propose WaterSearch, a sentence-level, search-based watermarking framework adaptable to a wide range of existing methods. WaterSearch enhances text quality by…
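As a rough illustration of what "manipulating token generation probabilities" can look like, the sketch below follows the green-list idea of the KGW family of watermarks that the reviews refer to: a secret key and the previous token seed a pseudo-random split of the vocabulary, "green" tokens receive a logit bonus, and detection counts green tokens with a z-score. It is not WaterSearch's own embedding scheme. The toy vocabulary, the random logits standing in for a language model, and the names `green_list`, `GAMMA`, `DELTA`, and `key` are all assumptions made for this example.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50  # toy vocabulary size (assumption)
GAMMA = 0.5      # fraction of the vocabulary placed on the green list
DELTA = 2.0      # logit bonus added to green tokens


def green_list(prev_token, key=42):
    """Seed a PRNG from a secret key and the previous token, then draw the green set."""
    digest = hashlib.sha256(f"{key}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(VOCAB_SIZE), int(GAMMA * VOCAB_SIZE)))


def sample(logits, rng):
    """Draw one token id from softmax(logits)."""
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(n_tokens, watermark, seed=0):
    """Generate token ids; random logits stand in for a real model (assumption)."""
    rng = random.Random(seed)
    tokens = [0]
    for _ in range(n_tokens):
        logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]
        if watermark:
            green = green_list(tokens[-1])
            logits = [x + DELTA if i in green else x for i, x in enumerate(logits)]
        tokens.append(sample(logits, rng))
    return tokens


def z_score(tokens):
    """One-proportion z-test on how many tokens fall in their green list."""
    t = len(tokens) - 1
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    return (hits - GAMMA * t) / math.sqrt(t * GAMMA * (1 - GAMMA))


if __name__ == "__main__":
    print("watermarked z:", z_score(generate(200, watermark=True)))
    print("plain z:", z_score(generate(200, watermark=False)))
```

A larger `DELTA` makes the z-score grow faster but pushes sampling further from the model's own distribution, which is the detectability versus quality trade-off the abstract describes.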

Peer Reviews

Decision · ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 6 · Confidence 3

Strengths

1. The shift from a purely token-level manipulation to a chunk-level search-and-select paradigm is a significant and compelling contribution. It elegantly reframes the watermarking problem as a multi-criteria optimization, which directly addresses a well-known limitation of existing methods.
2. The paper provides a theoretical analysis (Theorem 1) linking the macroscopic (sentence-level) selection objective with the microscopic (token-level) watermarking trade-off. This strengthens the methodol

Weaknesses

1. Algorithm 1 requires generating $k$ candidate chunks in parallel at each step. This implies the generation time (latency) will be roughly $k$ times that of a baseline method. The paper's claim of "low computational cost" is misleading as it primarily focuses on memory (KV cache).
2. Algorithm 2 (Detection) appears to require the detector to "Recover the seeds from generation". This suggests the detector must know the exact context $c$ and the random seed generator used during generation. This
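The cost concern in the first point can be made concrete with a small counting sketch. It is not the paper's Algorithm 1: `CountingGenerator`, `search_generate`, the seed pool, and the placeholder `score` function are hypothetical names, and a seeded random generator stands in for one watermarked decoding pass. The sketch only shows that a search over $k$ seeds per chunk performs $k$ times as many token-decoding steps as a single pass; whether that becomes $k$ times the wall-clock latency depends on how well the candidates can be batched in parallel.

```python
import random


class CountingGenerator:
    """Toy stand-in for a watermarked decoder that counts token-decoding steps."""

    def __init__(self):
        self.token_steps = 0

    def chunk(self, context, seed, chunk_len):
        # Each seed deterministically yields a different candidate continuation.
        rng = random.Random(seed * 1_000_003 + len(context))
        self.token_steps += chunk_len
        return [rng.randrange(100) for _ in range(chunk_len)]


def search_generate(gen, score, seed_pool, n_chunks, chunk_len=5):
    """Per chunk: decode one candidate per seed, keep the highest-scoring one."""
    text = []
    for _ in range(n_chunks):
        candidates = [gen.chunk(text, seed, chunk_len) for seed in seed_pool]
        text.extend(max(candidates, key=score))
    return text


if __name__ == "__main__":
    gen = CountingGenerator()
    # `sum` is a placeholder score; a real system would score quality and watermark.
    out = search_generate(gen, score=sum, seed_pool=[1, 2, 3, 4], n_chunks=3)
    print(len(out), "tokens kept,", gen.token_steps, "token steps decoded")
```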

Reviewer 02 · Rating 2 · Confidence 4

Strengths

1. The idea of using multiple candidates is clear and effective.
2. The paper also includes a solid theoretical analysis of the proposed method.
3. Evaluations are comprehensive, on various models and tasks.

Weaknesses

1. The overhead of this method seems to be significant.
2. A main weakness of the paper is the limited comparison against recent works. The paper mainly compares against the original KGW method. More recent and stronger baselines, including both token-level and semantic-level watermarking methods, should be compared for a more comprehensive evaluation.
3. Robustness evaluation is also limited. Stronger modification and paraphrasing attackers, beyond deletion, insertion, and synonym substitution, need

Reviewer 03 · Rating 4 · Confidence 4

Strengths

* Simple idea and framework: WaterSearch can be applied on top of existing KGW-style watermarking schemes with minimal modification.
* The method improves performance across all evaluated datasets, including difficult cases such as short-text or low-entropy settings, where KGW tends to fail.
* The paper uses WaterBench and additional benchmarks (e.g., RepoBench-P) and shows gains across multiple model families.
* Figures and algorithm descriptions make the approach easy to follow; the writing is

Weaknesses

* Incremental conceptual novelty: The idea to generate several watermarked candidates and pick the best is intuitive, but very closely resembles beam search or rejection sampling. The contribution feels more engineering-oriented than conceptual, especially given that most of the theoretical development restates expected properties of the existing KGW trade-off.
* Computational inefficiency: WaterSearch performs parallel or beam-style generation of multiple watermarked candidates per chunk and s

Reviewer 04 · Rating 4 · Confidence 4

Strengths

1. The design of alpha is rigorous overall, with a theoretical proof of its effectiveness and ablation studies demonstrating optimal alpha values, reasonably extending the KGW method.
2. The time complexity is well designed, so computational cost increases only moderately.
3. Experiments demonstrate that sufficiently large differences between random seeds enable multiple outputs of the watermark to combine into text with semantics closer to the original meaning, including the validity of

Weaknesses

1. Visualization examples are missing for most results; only a single NBA example is shown.
2. The score q is a linear sum of semantic similarity to the original output and watermark quality. This is straightforward, but it is questionable whether a linear combination is effective; more theoretical support is needed.
3. The strategy for picking different random seeds is still not clear enough to me.
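The question in the second point can be illustrated with a toy version of a linear score. The form $q = \alpha \cdot \text{similarity} + (1 - \alpha) \cdot \text{watermark}$, the direction of $\alpha$, the `Candidate` fields, and the numbers below are assumptions for this sketch, not values or definitions from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    similarity: float  # semantic similarity to the unwatermarked output, in [0, 1]
    watermark: float   # normalized watermark strength, in [0, 1]


def q(c, alpha):
    """Hypothetical linear score: alpha weights similarity, (1 - alpha) the watermark."""
    return alpha * c.similarity + (1 - alpha) * c.watermark


def select(candidates, alpha):
    """Return the candidate with the highest linear score."""
    return max(candidates, key=lambda c: q(c, alpha))


cands = [
    Candidate("A", similarity=0.90, watermark=0.20),  # fluent, weak watermark
    Candidate("B", similarity=0.50, watermark=0.80),  # strong watermark, less faithful
    Candidate("C", similarity=0.68, watermark=0.48),  # balanced, dominated by neither
]

if __name__ == "__main__":
    for alpha in (0.2, 0.5, 0.8):
        print(alpha, select(cands, alpha).text)
```

With these numbers, a high `alpha` selects A and a low `alpha` selects B, while the balanced candidate C is never selected for any `alpha` in [0, 1] even though neither A nor B dominates it. This is a general property of linear scoring (it can only pick candidates on the convex hull of the score pairs) and is one concrete way to read the reviewer's doubt about a linear combination.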

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Hate Speech and Cyberbullying Detection