Watermarking Low-entropy Generation for Large Language Models: An Unbiased and Low-risk Method
Minjia Mao, Dongjun Wei, Zeyu Chen, Xiao Fang, Michael Chau

TL;DR
This paper introduces STA-1, an unbiased watermarking method for large language models. It preserves the original token distribution in expectation, lowers the risk of unsatisfactory outputs in low-entropy scenarios, and supports efficient, robust detection without prompts or white-box access to the model.
Contribution
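The paper proposes STA-1 (Sampling One Then Accepting), an unbiased watermark that preserves the original token distribution in expectation, carries a lower risk of unsatisfactory low-entropy outputs than existing unbiased watermarks, and supports detection with statistical guarantees that needs neither prompts nor a white-box LLM.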
Findings
STA-1 maintains original token distribution in expectation.
STA-1 detection is fast, needs neither prompts nor a white-box LLM, and remains robust against various watermarking attacks.
Experimental results on low-entropy and high-entropy datasets confirm STA-1's effectiveness.
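The detection finding can be illustrated with a one-sided z-test on green-token counts, a common statistic for green-list watermarks. This is a minimal sketch, not the paper's detector: the abstract does not state which test STA-1 uses, and the hash-based green rule, the key `demo-key`, the ratio `gamma`, and the function names `is_green` and `green_z_score` are all assumptions made for illustration. The point it shows is that such a detector reads only token ids and a secret key, so it needs no prompt and no model access.

```python
import hashlib
import math


def is_green(prev_token: int, token: int, key: str = "demo-key", gamma: float = 0.5) -> bool:
    """Hypothetical green-list rule: hash (key, previous token, token) to [0, 1)."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < gamma


def green_z_score(tokens, key: str = "demo-key", gamma: float = 0.5) -> float:
    """z-score of the green-token count against the no-watermark expectation.

    Under the null hypothesis (unwatermarked text) each token is green with
    probability gamma, so the count is approximately Normal(gamma*n, gamma*(1-gamma)*n).
    """
    pairs = list(zip(tokens, tokens[1:]))
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least two tokens")
    hits = sum(is_green(prev, cur, key, gamma) for prev, cur in pairs)
    return (hits - gamma * n) / math.sqrt(gamma * (1 - gamma) * n)
```

A large positive z-score is evidence of a watermark; the decision threshold sets the false-positive rate, which is where a statistical guarantee of this kind would come from.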
Abstract
Recent advancements in large language models (LLMs) have highlighted the risk of misusing them, raising the need for accurate detection of LLM-generated content. In response, a viable solution is to inject imperceptible identifiers into LLMs, known as watermarks. Our research extends the existing watermarking methods by proposing the novel Sampling One Then Accepting (STA-1) method. STA-1 is an unbiased watermark that preserves the original token distribution in expectation and has a lower risk of producing unsatisfactory outputs in low-entropy scenarios compared to existing unbiased watermarks. In watermark detection, STA-1 does not require prompts or a white-box LLM, provides statistical guarantees, demonstrates high efficiency in detection time, and remains robust against various watermarking attacks. Experimental results on low-entropy and high-entropy datasets demonstrate that…
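The phrase "preserves the original token distribution in expectation" can be checked on a toy example. The sketch below assumes one reading of "Sampling One Then Accepting" that the abstract does not spell out: the vocabulary is split into a green and a red list, one token is drawn from the model's distribution and kept if it is green, and otherwise a second draw is accepted unconditionally. The function names `sta1_token_probs` and `expected_probs`, the toy distribution, and the uniformly random green list of fixed size are hypothetical.

```python
from itertools import combinations


def sta1_token_probs(p, green):
    """Output distribution of one sample-then-accept step for a fixed green set.

    Assumed rule: the first draw is kept if it is green; otherwise a second
    draw from the same distribution is accepted regardless of colour.
    """
    red_mass = sum(q for i, q in enumerate(p) if i not in green)
    return [(q if i in green else 0.0) + red_mass * q for i, q in enumerate(p)]


def expected_probs(p, green_size):
    """Average the step's output distribution over all equally likely green sets."""
    n = len(p)
    greens = [frozenset(c) for c in combinations(range(n), green_size)]
    totals = [0.0] * n
    for green in greens:
        for i, q in enumerate(sta1_token_probs(p, green)):
            totals[i] += q
    return [t / len(greens) for t in totals]


p = [0.9, 0.05, 0.03, 0.02]  # low-entropy toy distribution (assumption)
avg = expected_probs(p, green_size=2)
```

Under these assumptions `avg` matches `p` up to floating-point error, even though any single green list skews the step toward its green tokens. That skew is what a detector can pick up, while the average over green lists leaves the distribution unchanged.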
Taxonomy
Topics: Topic Modeling
