More Haste, Less Speed: Weaker Single-Layer Watermark Improves Distortion-Free Watermark Ensembles

Ruibo Chen; Yihan Wu; Xuehao Cui; Jingqi Zhang; Heng Huang

arXiv:2602.11793 · cs.CR · February 13, 2026

Open Access

TL;DR

This paper shows that weaker single-layer watermarks, which preserve entropy, can make watermark ensembles more effective at detecting content generated by large language models, contrary to the prevailing stronger-is-better approach.

Contribution

The paper proposes a general framework that uses weaker single-layer watermarks to preserve entropy, improving the robustness and detectability of watermark ensembles for large language models.

Findings

01. Weaker watermarks preserve entropy and improve ensemble detection.

02. Strong watermarks reduce entropy and weaken multi-layer detectability.

03. Empirical results show improved robustness with weaker watermark strategies.

Abstract

Watermarking has emerged as a crucial technique for detecting and attributing content generated by large language models. While recent advancements have utilized watermark ensembles to enhance robustness, prevailing methods typically prioritize maximizing the strength of the watermark at every individual layer. In this work, we identify a critical limitation in this "stronger-is-better" approach: strong watermarks significantly reduce the entropy of the token distribution, which paradoxically weakens the effectiveness of watermarking in subsequent layers. We theoretically and empirically show that detectability is bounded by entropy and that watermark ensembles induce a monotonic decrease in both entropy and the expected green-list ratio across layers. To address this inherent trade-off, we propose a general framework that utilizes weaker single-layer watermarks to preserve the entropy…
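
The entropy argument in the abstract can be illustrated with a small toy model. The sketch below is not the paper's framework: the `reweight` mixing rule, the `strength` parameter, and the fixed green lists are all assumptions chosen for illustration. It applies two watermark layers to a uniform distribution over eight tokens and tracks how much entropy is left for each subsequent layer.

```python
import math

def entropy_bits(p):
    """Shannon entropy of a probability vector, in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def reweight(p, green, strength):
    """Toy single-layer watermark: shift probability mass toward the green list.

    strength=0 leaves p unchanged; strength=1 puts all mass on green tokens.
    This mixing rule is an illustrative assumption, not the paper's scheme.
    """
    green_mass = sum(q for q, g in zip(p, green) if g)
    if green_mass == 0:
        return list(p)  # nothing to promote; leave the distribution as is
    boosted = [q / green_mass if g else 0.0 for q, g in zip(p, green)]
    return [(1 - strength) * q + strength * b for q, b in zip(p, boosted)]

def entropy_trace(p, green_lists, strength):
    """Entropy before any layer and after each successive layer."""
    trace = [entropy_bits(p)]
    for green in green_lists:
        p = reweight(p, green, strength)
        trace.append(entropy_bits(p))
    return trace

vocab = 8
uniform = [1 / vocab] * vocab
# Hypothetical fixed green lists standing in for two keyed layers.
layers = [
    [i < vocab // 2 for i in range(vocab)],  # layer 1: first half is green
    [i % 2 == 0 for i in range(vocab)],      # layer 2: even indices are green
]

strong = entropy_trace(uniform, layers, strength=1.0)
weak = entropy_trace(uniform, layers, strength=0.5)
print("strong:", [round(h, 3) for h in strong])
print("weak:  ", [round(h, 3) for h in weak])
```

In this toy setting the full-strength layers cut the entropy from 3 bits to 2 bits and then to 1 bit, while the half-strength layers leave more entropy after each step. The decrease shown here holds for this specific uniform example; for an arbitrary distribution and a single fixed green list, renormalizing onto a subset can also raise entropy, so the sketch should be read as intuition rather than as a proof of the paper's result.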

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis