TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection

Tom Sander; Hongyan Chang; Tomáš Souček; Tuan Tran; Valeriu Lacatusu; Sylvestre-Alvise Rebuffi; Alexandre Mourachko; Surya Parimi; Christophe Ropers; Rashel Moritz; Vanessa Stark; Hady Elsahar; Pierre Fernandez

arXiv:2605.12456·cs.CR·May 22, 2026

TL;DR

TextSeal is a distortion-free watermarking method for large language models that enables robust detection of AI-generated text without degrading model performance or adding inference overhead.

Contribution

TextSeal introduces a dual-key, localized watermarking scheme that outperforms baselines such as SynthID-text in detection strength, remains detectable under dilution, transfers through model distillation, and adds no inference overhead.

Findings

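01. TextSeal outperforms SynthID-text in detection strength.

02. It preserves downstream performance on reasoning benchmarks, and a multilingual human evaluation found no perceptible quality difference.

03. The watermark transfers through model distillation, so unauthorized use of a watermarked model's outputs for training can be detected.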

Abstract

We introduce TextSeal, a state-of-the-art watermark for large language models. Building on Gumbel-max sampling, TextSeal introduces dual-key generation to restore output diversity, along with entropy-weighted scoring and multi-region localization for improved detection. It supports serving optimizations such as speculative decoding and multi-token prediction, and does not add any inference overhead. TextSeal strictly dominates baselines like SynthID-text in detection strength and is robust to dilution, maintaining confident localized detection even in heavily mixed human/AI documents. The scheme is theoretically distortion-free, and evaluation across reasoning benchmarks confirms that it preserves downstream performance, while a multilingual human evaluation (6000 A/B comparisons, 5 languages) shows no perceptible quality difference. Beyond its use for provenance detection, TextSeal is…
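
For readers unfamiliar with the Gumbel-max sampling that the abstract builds on, the sketch below shows the generic single-key version of that idea, not TextSeal itself. Each candidate token gets a keyed pseudorandom value r in (0, 1), the sampler picks the token maximizing log(r)/p, and the detector sums -log(1 - r) over the observed tokens. Everything named here (`prf_uniform`, `sample_watermarked`, `detection_z`, the toy model, `VOCAB_SIZE`, `WINDOW`, the SHA-256 hash) is a hypothetical choice made for illustration. The dual-key generation, entropy-weighted scoring, and multi-region localization described in the abstract are not reproduced.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50  # toy vocabulary size (assumption for this sketch)
WINDOW = 2       # previous tokens hashed together with the key (assumption)


def prf_uniform(key: bytes, context: tuple, token: int) -> float:
    """Keyed pseudorandom value in the open interval (0, 1)."""
    msg = key + b"|" + ",".join(map(str, context)).encode() + b"|" + str(token).encode()
    digest = hashlib.sha256(msg).digest()
    # Keep 52 bits so that adding 0.5 stays exact in double precision.
    return ((int.from_bytes(digest[:8], "big") >> 12) + 0.5) / 2**52


def sample_watermarked(probs, key: bytes, context: tuple) -> int:
    """Gumbel-max style choice: argmax of log(r_i) / p_i over tokens with p_i > 0."""
    best_token, best_value = -1, -math.inf
    for token, p in enumerate(probs):
        if p <= 0.0:
            continue
        value = math.log(prf_uniform(key, context, token)) / p
        if value > best_value:
            best_token, best_value = token, value
    return best_token


def detection_z(tokens, key: bytes) -> float:
    """Sum -log(1 - r) over scored tokens, normalized to a z-like statistic.

    Without a watermark each term behaves like an Exp(1) draw (mean 1, variance 1).
    """
    total, count = 0.0, 0
    for i in range(WINDOW, len(tokens)):
        r = prf_uniform(key, tuple(tokens[i - WINDOW:i]), tokens[i])
        total += -math.log(1.0 - r)
        count += 1
    return (total - count) / math.sqrt(count) if count else 0.0


def toy_next_token_probs(rng: random.Random):
    """Stand-in for a language model: a random next-token distribution."""
    weights = [rng.random() for _ in range(VOCAB_SIZE)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(key: bytes, length: int, seed: int = 0):
    """Generate `length` watermarked tokens after WINDOW placeholder tokens."""
    rng = random.Random(seed)
    tokens = [0] * WINDOW
    for _ in range(length):
        probs = toy_next_token_probs(rng)
        tokens.append(sample_watermarked(probs, key, tuple(tokens[-WINDOW:])))
    return tokens


key = b"secret-key"
watermarked = generate(key, 200)
plain_rng = random.Random(1)
unmarked = [plain_rng.randrange(VOCAB_SIZE) for _ in range(202)]

z_wm = detection_z(watermarked, key)
z_plain = detection_z(unmarked, key)
print(f"watermarked z = {z_wm:.1f}, unmarked z = {z_plain:.1f}")
```

In this single-key form the chosen token is fully determined by the key, the context window, and the model's probabilities, so repeating a prompt yields the same output. That is presumably the diversity loss the abstract says dual-key generation is meant to restore.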
