Majority Bit-Aware Watermarking For Large Language Models
Jiahao Xu, Rui Hu, Olivera Kotevska, Zikai Zhang

TL;DR
This paper introduces a watermarking method for large language models that improves the trade-off between text quality and watermark detectability. Its majority bit-aware encoding keeps the watermark signal strong even when the preferred token set (green list) is large.
Contribution
The paper proposes majority bit-aware encoding, a new message encoding paradigm that improves both watermark robustness and the quality of LLM-generated text. It comes in two implementations: MajorMark and MajorMark+.
Findings
Achieves higher decoding accuracy than previous methods.
Maintains superior text quality while embedding watermarks.
Effective on state-of-the-art LLMs.
Abstract
The growing deployment of Large Language Models (LLMs) has raised concerns about their misuse in generating harmful or deceptive content. To address this issue, watermarking methods have been proposed to embed identifiable multi-bit messages into generated text for misuse tracing. However, existing methods often suffer from a fundamental trade-off between text quality and decoding accuracy. In particular, they have to restrict the size of the preferred token set (i.e., green list) during encoding to maintain a detectable watermark signal for decoding, which inevitably degrades generation quality. To improve this trade-off, we propose a novel message encoding paradigm called majority bit-aware encoding, which relaxes the dependence of the watermark signal strength on the green list size. This strategy allows a strong watermark signal to be preserved in generated texts even when using a…
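To make the green-list trade-off mentioned in the abstract concrete, here is a minimal sketch of generic green-list logit biasing. It is not the paper's majority bit-aware encoding, whose details are not given in this excerpt. The names `gamma` (green list fraction), `delta` (logit bias), and `seed`, as well as all helper functions, are hypothetical choices made for illustration only.

```python
import math
import random


def green_list(vocab_size, gamma, seed):
    """Pseudo-randomly pick a fraction gamma of token ids as the green list."""
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def bias_logits(logits, green, delta):
    """Add delta to every green token's logit; leave the others unchanged."""
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def green_mass(probs, green):
    """Total probability assigned to green tokens."""
    return sum(p for i, p in enumerate(probs) if i in green)


# Toy setup (assumed values): random logits over a 1000-token vocabulary.
vocab_size = 1000
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(vocab_size)]

small = green_list(vocab_size, gamma=0.25, seed=42)  # restrictive green list
large = green_list(vocab_size, gamma=0.75, seed=42)  # permissive green list

for name, green in (("small", small), ("large", large)):
    before = green_mass(softmax(logits), green)
    after = green_mass(softmax(bias_logits(logits, green, delta=2.0)), green)
    print(f"{name}: green mass {before:.3f} -> {after:.3f}")
```

In this generic scheme, a small green list constrains sampling more heavily, while a large one leaves the model's distribution closer to the original but makes green tokens more likely by chance, which weakens the signal a frequency-based decoder can rely on. This is the tension the abstract says majority bit-aware encoding is designed to ease.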
