Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models

Mingjia Huo; Sai Ashish Somayajula; Youwei Liang; Ruisi Zhang; Farinaz Koushanfar; Pengtao Xie

arXiv:2402.18059·cs.LG·June 7, 2024·3 cites

Open Access · 1 Repo

TL;DR

This paper presents a novel multi-objective optimization watermarking method for large language models that improves detectability while preserving the semantic quality of generated texts.

Contribution

It introduces a lightweight, token-specific watermarking approach using multi-objective optimization to enhance both detectability and semantic coherence.

Findings

01. Outperforms existing watermarking methods in detectability.

02. Maintains high semantic coherence in watermarked texts.

03. Demonstrates effectiveness on large language models.

Abstract

Large language models generate high-quality responses that may contain misinformation, underscoring the need for regulation by distinguishing AI-generated from human-written texts. Watermarking, which embeds hidden markers that are imperceptible to humans into texts during the LLM inference phase, is pivotal in this context. Achieving both detectability of the inserted watermarks and semantic quality of the generated texts is challenging. While current watermarking algorithms have made promising progress in this direction, there remains significant scope for improvement. To address these challenges, we introduce a novel multi-objective optimization (MOO) approach for watermarking that uses lightweight networks to generate token-specific watermarking logits and splitting ratios. By leveraging MOO to optimize for both detection and semantic objective functions, our method…
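To make "token-specific watermarking logits and splitting ratios" concrete, here is a minimal sketch and not the authors' implementation. It assumes a green-list style scheme in which the splitting ratio (gamma) sets the fraction of the vocabulary that is favored and the watermark logit (delta) is the boost added to those tokens, both chosen per position from the preceding token. The names token_params, green_list, watermark_logits, generate, and detection_z_score are hypothetical, and token_params is a fixed stand-in for the lightweight networks the abstract mentions.

```python
import math
import random

VOCAB_SIZE = 100  # toy vocabulary; real tokenizers are far larger


def token_params(prev_token):
    """Hypothetical stand-in for the learned networks: returns (gamma, delta)."""
    # The abstract says these values come from lightweight networks; here they
    # are fixed by the parity of the previous token purely for illustration.
    return (0.25, 2.0) if prev_token % 2 == 0 else (0.5, 1.0)


def green_list(prev_token, gamma, key=0):
    """Pseudo-randomly select a gamma fraction of the vocabulary, seeded by prev_token."""
    rng = random.Random(key * 1_000_003 + prev_token)
    return set(rng.sample(range(VOCAB_SIZE), int(gamma * VOCAB_SIZE)))


def watermark_logits(logits, prev_token, gamma, delta, key=0):
    """Add delta to the logit of every green-list token."""
    green = green_list(prev_token, gamma, key)
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def generate(num_tokens, start_token=0):
    """Greedy decoding from a dummy model whose raw logits are all zero."""
    tokens = [start_token]
    for _ in range(num_tokens):
        gamma, delta = token_params(tokens[-1])
        logits = watermark_logits([0.0] * VOCAB_SIZE, tokens[-1], gamma, delta)
        tokens.append(max(range(VOCAB_SIZE), key=lambda i: logits[i]))
    return tokens


def detection_z_score(tokens, key=0):
    """z-score of the green-token count when gamma varies per position."""
    hits, mean, var = 0, 0.0, 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        gamma, _ = token_params(prev)
        hits += cur in green_list(prev, gamma, key)
        mean += gamma                 # expected green hits without a watermark
        var += gamma * (1.0 - gamma)  # variance of one Bernoulli(gamma) draw
    return (hits - mean) / math.sqrt(var)


if __name__ == "__main__":
    print(detection_z_score(generate(50)))
```

In this toy setup every generated token falls in its green list, so the z-score is large, whereas text produced without the boost would be expected to score near zero. The trade-off the abstract describes appears here as the choice of gamma and delta: a larger delta makes detection easier but distorts the model's original distribution more.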

Code & Models

Repositories

mignonjia/ts_watermark (PyTorch, official)

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Internet Traffic Analysis and Secure E-voting · Generative Adversarial Networks and Image Synthesis