Towards Anytime-Valid Statistical Watermarking
Baihe Huang, Eric Xu, Kannan Ramchandran, Jiantao Jiao, Michael I. Jordan

TL;DR
This paper introduces a new e-value-based watermarking framework for LLMs that allows for efficient, anytime-valid detection of machine-generated text, overcoming limitations of fixed-horizon methods.
Contribution
It develops the first principled, anytime-valid watermarking method using e-values, enabling adaptive detection with improved sample efficiency.
Findings
Reduces the token budget needed for detection by 13-15% relative to baselines.
Provides a theoretical foundation for optimal sampling and stopping time.
Demonstrates effectiveness through simulations and benchmark evaluations.
Abstract
The proliferation of Large Language Models (LLMs) necessitates efficient mechanisms to distinguish machine-generated content from human text. While statistical watermarking has emerged as a promising solution, existing methods suffer from two critical limitations: the lack of a principled approach for selecting sampling distributions and the reliance on fixed-horizon hypothesis testing, which precludes valid early stopping. In this paper, we bridge this gap by developing the first e-value-based watermarking framework, Anchored E-Watermarking, that unifies optimal sampling with anytime-valid inference. Unlike traditional approaches where optional stopping invalidates Type-I error guarantees, our framework enables valid anytime inference by constructing a test supermartingale for the detection process. By leveraging an anchor distribution to approximate the target model, we characterize…
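To make the idea of a test supermartingale concrete, here is a minimal sketch of generic e-value sequential testing. It is not the paper's Anchored E-Watermarking construction. It assumes a toy setting in which each token yields a binary "green" indicator that is Bernoulli(0.5) for human text (the null) and Bernoulli(q) for watermarked text. The function name `sequential_detect` and the parameters `q` and `alpha` are hypothetical, chosen for illustration only. The running product of per-token likelihood ratios is a nonnegative martingale under the null, so by Ville's inequality the probability that it ever reaches 1/alpha is at most alpha, whatever the stopping rule.

```python
import math


def sequential_detect(bits, q=0.75, alpha=0.01):
    """Toy anytime-valid detector (illustrative assumption, not the paper's method).

    bits  : iterable of 0/1 "green token" indicators, one per token.
    q     : assumed probability of a green token under watermarking (hypothetical).
    alpha : target Type-I error level.

    Returns (detected, tokens_used).
    """
    # Reject the null as soon as the e-process reaches 1/alpha.
    log_threshold = math.log(1.0 / alpha)
    log_wealth = 0.0  # log of the running product of e-values

    tokens_used = 0
    for x in bits:
        tokens_used += 1
        # Per-token e-value: likelihood ratio of Bernoulli(q) against Bernoulli(0.5).
        # Its expectation under the null is q + (1 - q) = 1.
        p_alt = q if x else 1.0 - q
        log_wealth += math.log(p_alt / 0.5)
        if log_wealth >= log_threshold:
            return True, tokens_used  # early stopping keeps the error guarantee

    return False, tokens_used


if __name__ == "__main__":
    # A stream of all-green tokens crosses the threshold well before the end.
    print(sequential_detect([1] * 50))
    # A stream with no green tokens never triggers detection.
    print(sequential_detect([0] * 50))
```

Because the detector may stop at the first threshold crossing, strongly watermarked streams need fewer tokens than a fixed-horizon test of the same level would consume, which is the kind of saving the summary above refers to.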
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
