Covert Multi-bit LLM Watermarking: An Information Theory and Coding Approach
Sidong Guo, Tyler Kann, Teodora Baluta, Matthieu R. Bloch

TL;DR
This paper introduces a novel information-theoretic approach to multi-bit watermarking in large language models, optimizing embedding strategies with polar codes to achieve low error rates and minimal impact on model performance.
Contribution
It provides an exact capacity characterization for multi-bit watermarking in LLMs and develops an explicit polar code-based algorithm for efficient covert embedding.
Findings
Achieves a bit-error rate below 10%
Attains a watermarking rate of 0.375 bits/token
Maintains negligible perplexity and distortion degradation
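As a rough illustration of what a rate of 0.375 bits/token means in practice, the sketch below converts between payload size and text length. The payload sizes and function names are hypothetical, and the calculation assumes the reported rate holds uniformly over the text, ignoring any block structure or coding overhead the actual scheme may have.

```python
import math

RATE_BITS_PER_TOKEN = 0.375  # watermarking rate reported in the findings


def tokens_needed(payload_bits, rate=RATE_BITS_PER_TOKEN):
    """Smallest number of tokens that can carry payload_bits at the given rate."""
    return math.ceil(payload_bits / rate)


def payload_capacity(num_tokens, rate=RATE_BITS_PER_TOKEN):
    """Largest whole number of bits that num_tokens can carry at the given rate."""
    return math.floor(num_tokens * rate)


if __name__ == "__main__":
    print(tokens_needed(64))      # tokens for a hypothetical 64-bit payload
    print(payload_capacity(200))  # bits in a hypothetical 200-token response
```

Under these assumptions, a 64-bit identifier would need 171 tokens, and a 200-token response could carry 75 bits.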
Abstract
We study the problem of multi-bit watermarking for large language models (LLMs). We introduce a block-autoregressive model inspired by multi-token prediction, in which the encoder has limited non-causal access to token distributions within each block. This formulation enables an information-theoretic characterization of multi-bit watermarking capacity, by which the knowledge of LLM cover statistics is leveraged to enable a multi-bit covert embedding. We study the information-theoretic limits of the model by combining Gelfand-Pinsker and channel synthesis coding techniques and obtain an exact characterization of the capacity. The embedding strategy is further optimized across blocks using a constrained Markov decision process (CMDP) and we develop an explicit algorithm based on polar codes following the information-theoretic principles. Our algorithm achieves a bit-error rate below 10…
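The polar codes mentioned in the abstract build on Arikan's polar transform, x = u F^{⊗n} over GF(2) with kernel F = [[1, 0], [1, 1]]. The sketch below shows only that generic transform, plus a toy encoder that places message bits at chosen positions and freezes the rest to zero. It is not the paper's covert embedding algorithm: the function names and the information positions are hypothetical, and a real design would select positions from channel reliabilities.

```python
def polar_transform(u):
    """Return x = u * F^{(tensor) n} over GF(2), F = [[1, 0], [1, 1]].

    len(u) must be a power of two. No bit-reversal permutation is applied.
    """
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    x = list(u)
    step = 1
    while step < n:
        # Butterfly stage: XOR the upper half of each pair of blocks into the lower half.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]
        step *= 2
    return x


def polar_encode(message_bits, info_positions, block_length):
    """Toy encoder: message bits at info_positions, frozen bits set to 0."""
    if len(message_bits) != len(info_positions):
        raise ValueError("one position is needed per message bit")
    u = [0] * block_length
    for bit, pos in zip(message_bits, info_positions):
        u[pos] = bit
    return polar_transform(u)


if __name__ == "__main__":
    # Hypothetical choice: 3 message bits in a block of 8 (rate 3/8 = 0.375).
    codeword = polar_encode([1, 0, 1], info_positions=[5, 6, 7], block_length=8)
    print(codeword)
```

Because F^{⊗n} is its own inverse over GF(2), applying `polar_transform` twice returns the original vector, which gives a quick sanity check on the implementation.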
