Advancing Beyond Identification: Multi-bit Watermark for Large Language Models

KiYoon Yoo; Wonhyuk Ahn; Nojun Kwak

arXiv:2308.00221·cs.CL·March 21, 2024·2 cites

TL;DR

This paper introduces a multi-bit watermarking technique for large language models that embeds traceable information during generation. The message can be extracted robustly without model access or finetuning, going beyond existing zero-bit methods, which support detection only.

Contribution

The paper proposes Multi-bit Watermark via Position Allocation, which embeds traceable multi-bit information during generation by allocating tokens to different parts of the message, improving robustness, traceability, and efficiency.

Findings

01. Outperforms existing methods in robustness and latency.

02. Enables embedding and extraction of long messages without finetuning.

03. Maintains text quality while providing traceability and detection.

Abstract

We show the viability of tackling misuses of large language models beyond the identification of machine-generated text. While existing zero-bit watermark methods focus on detection only, some malicious misuses demand tracing the adversary user for counteracting them. To address this, we propose Multi-bit Watermark via Position Allocation, embedding traceable multi-bit information during language model generation. Through allocating tokens onto different parts of the messages, we embed longer messages in high corruption settings without added latency. By independently embedding sub-units of messages, the proposed method outperforms the existing works in terms of robustness and latency. Leveraging the benefits of zero-bit watermarking, our method enables robust extraction of the watermark without any model access, embedding and extraction of long messages ($\geq$ 32-bit) without…
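The position-allocation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the official repository for that); it is a simplified illustration under stated assumptions: tokens are integer ids from a hypothetical 1000-token vocabulary, the "language model" is a uniform random sampler, the watermark is a hard constraint rather than a logit bias, and every name below (`embed`, `extract`, `_position_and_colors`, `key`) is made up for this example. Each generated token is pseudo-randomly assigned to one message position, and that token is then drawn from the half of the vocabulary matching the bit at that position. Extraction recomputes the same assignment from the text alone and takes a majority vote per position.

```python
import hashlib
import random

VOCAB_SIZE = 1000  # toy vocabulary of integer token ids (assumption)


def _seed(prev_token: int, key: int) -> int:
    """Derive a deterministic seed from the previous token and a secret key."""
    digest = hashlib.sha256(f"{key}:{prev_token}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def _position_and_colors(prev_token: int, key: int, num_bits: int):
    """Return the message position for this step and a 0/1 color per token id."""
    rng = random.Random(_seed(prev_token, key))
    position = rng.randrange(num_bits)   # which message bit this token carries
    perm = list(range(VOCAB_SIZE))
    rng.shuffle(perm)                    # pseudo-random split of the vocabulary
    color = [0] * VOCAB_SIZE
    for tok in perm[VOCAB_SIZE // 2:]:
        color[tok] = 1
    return position, color


def embed(message_bits, length, key=42, seed=0):
    """Generate `length` toy tokens that carry `message_bits`."""
    sampler = random.Random(seed)        # stand-in for a language model
    tokens = [0]                         # fixed start token (assumption)
    for _ in range(length):
        position, color = _position_and_colors(tokens[-1], key, len(message_bits))
        bit = message_bits[position]
        # Hard watermark: only sample tokens whose color equals the bit.
        candidates = [t for t in range(VOCAB_SIZE) if color[t] == bit]
        tokens.append(sampler.choice(candidates))
    return tokens


def extract(tokens, num_bits, key=42):
    """Recover the message by per-position majority vote; no model needed."""
    votes = [[0, 0] for _ in range(num_bits)]
    for prev, tok in zip(tokens, tokens[1:]):
        position, color = _position_and_colors(prev, key, num_bits)
        votes[position][color[tok]] += 1
    return [int(v[1] > v[0]) for v in votes]


message = [1, 0, 1, 1, 0, 0, 1, 0]
tokens = embed(message, length=400)
recovered = extract(tokens, num_bits=len(message))
```

Because each message position collects its own votes independently, corrupting some tokens only perturbs the votes those tokens touch, which is the intuition behind embedding sub-units of the message independently. A real system would bias model logits instead of hard-filtering the vocabulary, so that text quality is preserved.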

Code & Models

Repositories

bangawayoo/mb-lm-watermarking (PyTorch, official)

Videos

Advancing Beyond Identification: Multi-bit Watermark for Large Language Models · Underline

Taxonomy

Topics: Internet Traffic Analysis and Secure E-voting · Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning

Methods: Focus