Let Watermarks Speak: A Robust and Unforgeable Watermark for Language Models

Minhao Bai

arXiv:2412.19603 · cs.CR · December 30, 2024

Open Access

TL;DR

This paper introduces a novel, robust, and unforgeable single-bit watermarking scheme for language models that can embed multiple watermark signals, enhancing content traceability and integrity verification.

Contribution

First to propose an undetectable, robust, single-bit watermarking scheme capable of embedding two different signals for language models.

Findings

01

Achieves comparable robustness to advanced zero-bit schemes.

02

Constructs a multi-bit scheme using prompt hash or generated content as watermark signals.

03

Demonstrates practical effectiveness and robustness through experiments.

Abstract

Watermarking is an effective way to trace model-generated content. Current watermark methods cannot resist forgery attacks, such as a deceptive claim that the model-generated content is a response to a fabricated prompt. None of them can be made unforgeable without degrading robustness. Unforgeability demands that the watermarked output be not only detectable but also verifiable for integrity, indicating whether it has been modified. This underscores the necessity and significance of a multi-bit watermarking scheme. Recent works try to build multi-bit schemes on top of existing zero-bit watermarking schemes, but they either degrade robustness or impose a significant computational burden. We aim to design a novel single-bit watermark scheme that can embed two different watermark signals. This paper's main contribution is that we are the first to propose an…
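
The forgery scenario in the abstract (a claim that watermarked text answers a fabricated prompt) can be illustrated with a minimal sketch. It assumes the watermark signal is a bit string derived from a hash of the prompt, as the findings mention; the function names, the choice of SHA-256, and the 256-bit length are illustrative assumptions, not the paper's construction, and the embedding and extraction steps are replaced by a stand-in.

```python
import hashlib


def prompt_signal(prompt: str, n_bits: int = 256) -> list[int]:
    """Hypothetical helper: derive an n-bit signal from the SHA-256 hash of a prompt."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    # Unpack each byte into bits, most significant bit first.
    bits = [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]
    return bits[:n_bits]


def matches_claimed_prompt(extracted_bits: list[int], claimed_prompt: str) -> bool:
    """Check whether bits recovered from a text agree with the claimed prompt's hash."""
    return extracted_bits == prompt_signal(claimed_prompt, len(extracted_bits))


genuine = "Summarize the history of the printing press."
fabricated = "Write an insulting article about a public figure."

# Stand-in for the bits a watermark decoder would recover from the output;
# the actual embedding and decoding procedure is not shown here.
embedded = prompt_signal(genuine)

print(matches_claimed_prompt(embedded, genuine))
print(matches_claimed_prompt(embedded, fabricated))
```

Under these assumptions, the check succeeds only for the prompt whose hash was embedded, which is the kind of integrity evidence a zero-bit (present or absent) watermark cannot provide on its own.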

Taxonomy

Topics: Handwritten Text Recognition Techniques · Advanced Steganography and Watermarking Techniques · Music and Audio Processing