Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
Mingjia Huo, Sai Ashish Somayajula, Youwei Liang, Ruisi Zhang, Farinaz, Koushanfar, Pengtao Xie

TL;DR
This paper presents a novel multi-objective optimization watermarking method for large language models that improves detectability while preserving the semantic quality of generated texts.
Contribution
It introduces a lightweight, token-specific watermarking approach using multi-objective optimization to enhance both detectability and semantic coherence.
Findings
Outperforms existing watermarking methods in detectability.
Maintains high semantic coherence in watermarked texts.
Demonstrates effectiveness on large language models.
Abstract
Large language models generate high-quality responses with potential misinformation, underscoring the need for regulation by distinguishing AI-generated and human-written texts. Watermarking is pivotal in this context, which involves embedding hidden markers in texts during the LLM inference phase, which is imperceptible to humans. Achieving both the detectability of inserted watermarks and the semantic quality of generated texts is challenging. While current watermarking algorithms have made promising progress in this direction, there remains significant scope for improvement. To address these challenges, we introduce a novel multi-objective optimization (MOO) approach for watermarking that utilizes lightweight networks to generate token-specific watermarking logits and splitting ratios. By leveraging MOO to optimize for both detection and semantic objective functions, our method…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Steganography and Watermarking Techniques · Internet Traffic Analysis and Secure E-voting · Generative Adversarial Networks and Image Synthesis
