On Google's SynthID-Text LLM Watermarking System: Theoretical Analysis and Empirical Validation

Romina Omidi; Yun Dong; Binghui Wang

arXiv:2603.03410·cs.CR·March 17, 2026

TL;DR

This paper analyzes Google's SynthID-Text watermarking system both theoretically and empirically, covering its detection performance and watermark robustness, and shows that the mean score can be broken by a layer inflation attack.

Contribution

The paper presents the first theoretical analysis of SynthID-Text, focused on detection performance and watermark robustness and complemented by empirical validation. It also offers insights into watermark removal and system vulnerabilities, including a layer inflation attack.

Findings

01. The mean score is inherently vulnerable to an increased number of tournament layers.

02. The Bayesian score improves watermark robustness.

03. The optimal value of the detection parameter is 0.5.

Abstract

Google's SynthID-Text, the first production-ready generative watermarking system for large language models, introduces a novel Tournament-based method that achieves state-of-the-art detectability for identifying AI-generated text. The system's innovation lies in: 1) a new Tournament sampling algorithm for watermark embedding, 2) a detection strategy based on an introduced score function (e.g., the Bayesian or mean score), and 3) a unified design that supports both distortionary and non-distortionary watermarking methods. This paper presents the first theoretical analysis of SynthID-Text, with a focus on its detection performance and watermark robustness, complemented by empirical validation. For example, we prove that the mean score is inherently vulnerable to increased tournament layers, and design a layer inflation attack to break SynthID-Text. We also prove the Bayesian score…
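To make the abstract's two core ingredients concrete, here is a minimal Python sketch of tournament sampling and mean-score detection. It is a toy illustration and not the authors' or Google's implementation. The hash-based `g_value` function, the uniform next-token distribution standing in for an LLM, the random tie-breaking, and all parameter values (`KEY`, `LAYERS`, `CONTEXT`, vocabulary size) are assumptions chosen for illustration.

```python
import hashlib
import random


def g_value(token, context, layer, key):
    """Hypothetical g-function: hash (key, layer, context, token) to one bit."""
    payload = f"{key}|{layer}|{context}|{token}".encode()
    return hashlib.sha256(payload).digest()[0] & 1


def tournament_sample(probs, context, key, num_layers, rng):
    """Draw 2**num_layers candidates, then run a knockout tournament on g-values."""
    vocab = list(range(len(probs)))
    candidates = rng.choices(vocab, weights=probs, k=2 ** num_layers)
    for layer in range(num_layers):
        winners = []
        for a, b in zip(candidates[0::2], candidates[1::2]):
            ga = g_value(a, context, layer, key)
            gb = g_value(b, context, layer, key)
            if ga > gb:
                winners.append(a)
            elif gb > ga:
                winners.append(b)
            else:
                winners.append(rng.choice((a, b)))  # assumed: ties broken at random
        candidates = winners
    return candidates[0]


def generate(length, vocab_size, key, num_layers, context_size, rng, watermark):
    """Generate tokens from a toy uniform distribution (stand-in for an LLM)."""
    probs = [1.0 / vocab_size] * vocab_size
    tokens = [rng.randrange(vocab_size) for _ in range(context_size)]  # toy prompt
    for _ in range(length):
        context = tuple(tokens[-context_size:])
        if watermark:
            tokens.append(tournament_sample(probs, context, key, num_layers, rng))
        else:
            tokens.append(rng.choices(list(range(vocab_size)), weights=probs, k=1)[0])
    return tokens


def mean_score(tokens, key, num_layers, context_size):
    """Average the g-values over all scored tokens and all tournament layers."""
    total, count = 0, 0
    for t in range(context_size, len(tokens)):
        context = tuple(tokens[t - context_size:t])
        for layer in range(num_layers):
            total += g_value(tokens[t], context, layer, key)
            count += 1
    return total / count


rng = random.Random(0)
KEY, LAYERS, CONTEXT = 42, 3, 4  # hypothetical settings; CONTEXT must be >= 1
wm_tokens = generate(200, 50, KEY, LAYERS, CONTEXT, rng, watermark=True)
plain_tokens = generate(200, 50, KEY, LAYERS, CONTEXT, rng, watermark=False)
wm_score = mean_score(wm_tokens, KEY, LAYERS, CONTEXT)
plain_score = mean_score(plain_tokens, KEY, LAYERS, CONTEXT)
print(f"watermarked: {wm_score:.3f}, unwatermarked: {plain_score:.3f}")
```

In this toy setting, unwatermarked text should score near 0.5, since each g-value is roughly a fair coin flip. Tournament-sampled text should score noticeably higher, because each layer favors tokens whose g-value is 1. A detector would compare the score against a threshold. How the score behaves as the number of layers grows is the subject of the paper's analysis and is not reproduced here.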

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Adversarial Robustness in Machine Learning · Internet Traffic Analysis and Secure E-voting