Adaptive Testing for Segmenting Watermarked Texts From Language Models

Xingchi Li; Xiaochi Liu; Guanxun Li

arXiv:2511.06645·stat.ML·November 11, 2025

Open Access

TL;DR

This paper introduces an adaptive, robust method for segmenting a text into watermarked and non-watermarked substrings, where the watermark is embedded by a large language model. The method improves detection accuracy without requiring precise prompt estimation.

Contribution

It extends likelihood-based watermark detection to adaptive segmentation, removing the need for accurate prompt estimation and enhancing robustness against prompt variability.

Findings

01 Effective segmentation of text into watermarked and non-watermarked segments

02 Robust performance without precise prompt estimation

03 Improved accuracy over previous methods

Abstract

The rapid adoption of large language models (LLMs), such as GPT-4 and Claude 3.5, underscores the need to distinguish LLM-generated text from human-written content to mitigate the spread of misinformation and misuse in education. One promising approach to address this issue is the watermark technique, which embeds subtle statistical signals into LLM-generated text to enable reliable identification. In this paper, we first generalize the likelihood-based LLM detection method of a previous study by introducing a flexible weighted formulation, and further adapt this approach to the inverse transform sampling method. Moving beyond watermark detection, we extend this adaptive detection strategy to tackle the more challenging problem of segmenting a given text into watermarked and non-watermarked substrings. In contrast to the approach in a previous study, which relies on accurate estimation…
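As background for the inverse transform sampling method mentioned in the abstract, the following is a minimal sketch of how such a watermark can be embedded and then checked with a simple distance statistic. It is a generic illustration only, not the paper's weighted likelihood-based statistic or its segmentation procedure. All names (`its_sample`, `generate`, `distance`), the toy random next-token distribution standing in for an LLM, and the choice of statistic are assumptions made for this example.

```python
import random

def its_sample(probs, u, perm):
    """Inverse transform sampling: return the token whose CDF interval
    (tokens ordered by perm) contains the key value u in [0, 1)."""
    cdf = 0.0
    for token in perm:
        cdf += probs[token]
        if u < cdf:
            return token
    return perm[-1]  # guard against floating-point round-off

def key_stream(n_tokens, vocab_size, key_seed):
    """Derive the secret permutation and uniform keys from the seed (hypothetical helper)."""
    key_rng = random.Random(key_seed)
    perm = list(range(vocab_size))
    key_rng.shuffle(perm)
    us = [key_rng.random() for _ in range(n_tokens)]
    return perm, us

def generate(n_tokens, vocab_size, key_seed):
    """Toy generator: a fresh random next-token distribution per step stands in for an LLM."""
    perm, us = key_stream(n_tokens, vocab_size, key_seed)
    model_rng = random.Random(0)  # assumption: placeholder for real model probabilities
    tokens = []
    for u in us:
        weights = [model_rng.random() for _ in range(vocab_size)]
        total = sum(weights)
        probs = [w / total for w in weights]
        tokens.append(its_sample(probs, u, perm))
    return tokens

def distance(tokens, vocab_size, key_seed):
    """Mean |u_t - normalized rank of token t under perm|.
    Watermarked text tends to give a small value; unrelated text gives about 1/3."""
    perm, us = key_stream(len(tokens), vocab_size, key_seed)
    rank = {tok: i for i, tok in enumerate(perm)}
    diffs = [abs(u - rank[t] / (vocab_size - 1)) for u, t in zip(us, tokens)]
    return sum(diffs) / len(diffs)

if __name__ == "__main__":
    watermarked = generate(200, 50, key_seed=42)
    plain_rng = random.Random(7)
    plain = [plain_rng.randrange(50) for _ in range(200)]
    print("watermarked distance:", distance(watermarked, 50, key_seed=42))
    print("plain distance:      ", distance(plain, 50, key_seed=42))
```

In this sketch the detector needs only the key, not the prompt or the model probabilities. Segmentation, as described in the abstract, would additionally require locating where within a text such a statistic changes, which this example does not attempt.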

Taxonomy

Topics: Topic Modeling · Misinformation and Its Impacts · Academic integrity and plagiarism