Optimal Detection for Language Watermarks with Pseudorandom Collision
T. Tony Cai, Xiang Li, Qi Long, Weijie J. Su, Garrett G. Wen

TL;DR
This paper develops a statistical framework for detecting language watermarks in text generated by large language models, accounting for dependencies caused by repetition, and provides optimal detection rules with proven error control.
Contribution
It introduces a hierarchical minimal unit framework and derives closed-form optimal detection rules for watermarks under realistic dependence conditions, advancing the theoretical foundation of watermark detection.
Findings
Improved detection power with rigorous Type I error control.
Repetition-induced dependence compromises Type I error control and invalidates standard analyses that assume perfect pseudorandomness.
Optimal detection rules derived for Gumbel-max and inverse-transform watermarks.
Abstract
Text watermarking plays a crucial role in ensuring the traceability and accountability of large language model (LLM) outputs and mitigating misuse. While promising, most existing methods assume perfect pseudorandomness. In practice, repetition in generated text induces collisions that create structured dependence, compromising Type I error control and invalidating standard analyses. We introduce a statistical framework that captures this structure through a hierarchical two-layer partition. At its core is the concept of minimal units -- the smallest groups treatable as independent across units while permitting dependence within. Using minimal units, we define a non-asymptotic efficiency measure and cast watermark detection as a minimax hypothesis testing problem. Applied to Gumbel-max and inverse-transform watermarks, our framework produces closed-form optimal rules. It explains why…
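To make the collision mechanism concrete, the sketch below shows one way repeated text can produce repeated pseudorandom seeds. It assumes a common seeding scheme in which the seed at each position is a hash of a secret key and the preceding `window` tokens. That scheme, the helper names `context_seed` and `group_positions_by_seed`, the key `"demo-key"`, and the toy token sequence are all illustrative assumptions, not taken from the paper. Grouping positions by seed is only a rough stand-in for the idea of units that are independent of one another but dependent within; it is not the paper's formal definition of a minimal unit.

```python
import hashlib
from collections import defaultdict


def context_seed(context, key="demo-key"):
    """Hypothetical seeding rule: hash a secret key with the preceding tokens."""
    payload = key + "|" + ",".join(str(tok) for tok in context)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def group_positions_by_seed(tokens, window=2, key="demo-key"):
    """Group token positions that share the same pseudorandom seed.

    Positions in the same group reuse identical pseudorandom numbers, so
    their detection statistics cannot be treated as independent draws.
    """
    groups = defaultdict(list)
    for t in range(window, len(tokens)):
        seed = context_seed(tokens[t - window:t], key)
        groups[seed].append(t)
    # dicts preserve insertion order, so groups appear in order of first use
    return list(groups.values())


# Toy sequence with a repeated phrase (5, 7, 9): repetition causes collisions.
tokens = [5, 7, 9, 5, 7, 9, 5, 7, 3]
groups = group_positions_by_seed(tokens, window=2)
collided = [g for g in groups if len(g) > 1]

print("positions scored:", sum(len(g) for g in groups))
print("distinct seeds:  ", len(groups))
print("collision groups:", collided)
```

In this toy input, seven scored positions map to only three distinct seeds, so a test that treats all seven statistics as independent would overstate the evidence. A dependence-aware analysis would instead work at the level of the groups.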
