Covert Multi-bit LLM Watermarking: An Information Theory and Coding Approach

Sidong Guo; Tyler Kann; Teodora Baluta; Matthieu R. Bloch

arXiv:2605.16709 · cs.IT · May 19, 2026

TL;DR

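This paper takes an information-theoretic approach to multi-bit watermarking in large language models. It characterizes the watermarking capacity, optimizes the embedding strategy across blocks with a constrained Markov decision process, and implements the scheme with polar codes, reporting a bit-error rate below 10%, a watermarking rate of 0.375 bits/token, and negligible perplexity degradation.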

Contribution

It provides an exact capacity characterization for multi-bit watermarking in LLMs and develops an explicit polar code-based algorithm for efficient covert embedding.
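
For orientation, the abstract attributes the capacity result to a combination of Gelfand-Pinsker and channel synthesis techniques. The classical Gelfand-Pinsker capacity, for a channel $P_{Y|X,S}$ whose state $S$ is known non-causally at the encoder, is reproduced below. This is the textbook formula, not the paper's capacity expression; reading the per-block token distributions as the state $S$ is an assumption made here for illustration only.

```latex
C_{\mathrm{GP}} \;=\; \max_{P_{U|S},\; x = f(u,s)} \bigl[\, I(U;Y) - I(U;S) \,\bigr]
```

Here $U$ is an auxiliary random variable and $f$ is a deterministic map from $(u, s)$ to the channel input. The term $I(U;S)$ is the rate spent aligning the codeword with the known state.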

Findings

01. Achieves a bit-error rate below 10%

02. Attains a watermarking rate of 0.375 bits/token

03. Maintains negligible perplexity and distortion degradation
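
To put these figures in concrete terms, the following minimal sketch converts a watermarking rate into a payload size and computes an empirical bit-error rate. The function names `payload_bits` and `bit_error_rate`, the 200-token text length, and the sample bit strings are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def payload_bits(rate_bits_per_token: float, n_tokens: int) -> float:
    """Number of watermark bits carried by n_tokens at the given rate."""
    return rate_bits_per_token * n_tokens


def bit_error_rate(sent, decoded) -> float:
    """Fraction of positions where the decoded bits differ from the sent bits."""
    if len(sent) != len(decoded) or len(sent) == 0:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(s != d for s, d in zip(sent, decoded))
    return errors / len(sent)


# Hypothetical example: a 200-token generation at the reported rate.
print(payload_bits(0.375, 200))

# Hypothetical sent/decoded messages with one flipped bit out of four.
print(bit_error_rate([1, 0, 1, 1], [1, 0, 0, 1]))
```

At 0.375 bits/token (3 bits per 8 tokens), a 200-token text would carry 75 bits. A bit-error rate below 10% would then mean fewer than 7.5 of those bits decoded incorrectly on average.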

Abstract

We study the problem of multi-bit watermarking for large language models (LLMs). We introduce a block-autoregressive model inspired by multi-token prediction, in which the encoder has limited non-causal access to token distributions within each block. This formulation enables an information-theoretic characterization of multi-bit watermarking capacity, by which the knowledge of LLM cover statistics is leveraged to enable a multi-bit covert embedding. We study the information-theoretic limits of the model by combining Gelfand-Pinsker and channel synthesis coding techniques and obtain an exact characterization of the capacity. The embedding strategy is further optimized across blocks using a constrained Markov decision process (CMDP) and we develop an explicit algorithm based on polar codes following the information-theoretic principles. Our algorithm achieves a bit-error rate below 10…
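
Since the abstract names polar codes as the basis of the explicit algorithm, a minimal sketch of the generic Arikan polar transform may help fix ideas. This is the standard textbook encoder over GF(2), not the authors' covert embedding scheme. The names `polar_transform` and `polar_encode` and the choice of `info_positions` are hypothetical. Selecting information positions by channel reliability and decoding (for example, successive cancellation) are omitted.

```python
def polar_transform(u):
    """Compute x = u * F^{(tensor) n} over GF(2), with F = [[1, 0], [1, 1]].

    Uses the in-place butterfly structure, without bit-reversal permutation.
    The length of u must be a power of two.
    """
    n = len(u)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    x = list(u)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]  # upper branch becomes the XOR of the pair
        step *= 2
    return x


def polar_encode(message, info_positions, n):
    """Place message bits at info_positions, freeze the rest to 0, then transform.

    info_positions is an assumption here: in practice it is chosen from the
    most reliable synthesized channels, which this sketch does not compute.
    """
    if len(message) != len(info_positions):
        raise ValueError("one message bit per information position is required")
    u = [0] * n
    for bit, pos in zip(message, info_positions):
        u[pos] = bit
    return polar_transform(u)


# Hypothetical usage: 2 message bits in a length-4 block.
codeword = polar_encode([1, 1], info_positions=[2, 3], n=4)
print(codeword)
```

The transform is its own inverse over GF(2), so applying `polar_transform` to a codeword returns the input vector `u`. That property makes a quick sanity check for an implementation.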
