Towards Anytime-Valid Statistical Watermarking

Baihe Huang; Eric Xu; Kannan Ramchandran; Jiantao Jiao; Michael I. Jordan

arXiv:2602.17608·cs.LG·February 20, 2026

Open Access

TL;DR

This paper introduces a new e-value-based watermarking framework for LLMs that allows for efficient, anytime-valid detection of machine-generated text, overcoming limitations of fixed-horizon methods.

Contribution

It develops the first principled, anytime-valid watermarking method using e-values, enabling adaptive detection with improved sample efficiency.

Findings

01 Reduces token budget for detection by 13-15% compared to baselines.

02 Provides a theoretical foundation for optimal sampling and stopping time.

03 Demonstrates effectiveness through simulations and benchmark evaluations.

Abstract

The proliferation of Large Language Models (LLMs) necessitates efficient mechanisms to distinguish machine-generated content from human text. While statistical watermarking has emerged as a promising solution, existing methods suffer from two critical limitations: the lack of a principled approach for selecting sampling distributions and the reliance on fixed-horizon hypothesis testing, which precludes valid early stopping. In this paper, we bridge this gap by developing the first e-value-based watermarking framework, Anchored E-Watermarking, that unifies optimal sampling with anytime-valid inference. Unlike traditional approaches where optional stopping invalidates Type-I error guarantees, our framework enables valid anytime inference by constructing a test supermartingale for the detection process. By leveraging an anchor distribution to approximate the target model, we characterize…
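
To make the anytime-valid idea concrete, here is a minimal Python sketch of a generic e-value sequential test. It is not the paper's Anchored E-Watermarking construction: the per-token scores, the linear bet `1 + lam * (2 * score - 1)`, and the names `e_value` and `sequential_detect` are all illustrative assumptions. The sketch assumes that, for human text, each score is uniform on [0, 1], so each factor has expectation 1 and the running product is a test martingale. Ville's inequality then bounds the probability of ever reaching 1/alpha by alpha, which is what permits stopping at any time.

```python
def e_value(score, lam=0.5):
    """Hypothetical per-token bet on a score in [0, 1].

    If score ~ Uniform(0, 1) (assumed null: human text), the expectation
    of this factor is exactly 1, and it is nonnegative for 0 <= lam <= 1.
    """
    return 1.0 + lam * (2.0 * score - 1.0)


def sequential_detect(scores, alpha=0.05, lam=0.5):
    """Multiply e-values token by token and stop once the product >= 1/alpha.

    Returns (detected, tokens_used). Checking after every token does not
    inflate the false-alarm rate beyond alpha under the assumed null.
    """
    wealth = 1.0
    threshold = 1.0 / alpha
    n = 0
    for n, score in enumerate(scores, start=1):
        wealth *= e_value(score, lam)
        if wealth >= threshold:
            return True, n  # early stop: evidence already sufficient
    return False, n  # ran out of tokens without rejecting the null


if __name__ == "__main__":
    # Strongly "watermarked-looking" scores: every factor is 1.5.
    print(sequential_detect([1.0] * 20))
    # Uninformative scores: every factor is 1.0, so wealth never grows.
    print(sequential_detect([0.5] * 20))
```

With every score equal to 1.0 each factor is 1.5, so the product first reaches 1/0.05 = 20 at the eighth token; with scores of 0.5 the product stays at 1 and the test never rejects.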

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning