StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models

Ya Jiang; Chuxiong Wu; Massieh Kordi Boroujeny; Brian Mark; Kai Zeng

arXiv:2506.05502 · cs.CR · June 9, 2025

TL;DR

StealthInk is a multi-bit watermarking method for large language models that preserves the original text distribution, embeds provenance data in the generated text, and supports fast tracing without access to the model's API or prompts.

Contribution

The paper presents a stealthy multi-bit watermarking scheme that preserves the original text distribution and makes LLM-generated text traceable to its source, going beyond existing zero-bit methods, which support detection but not identification.

Findings

01 High detectability and resilience demonstrated across tasks

02 Preserves original text distribution effectively

03 Enables embedding of multiple bits of information

Abstract

Watermarking for large language models (LLMs) offers a promising approach to identifying AI-generated text. Existing approaches, however, either compromise the distribution of original generated text by LLMs or are limited to embedding zero-bit information that only allows for watermark detection but ignores identification. We present StealthInk, a stealthy multi-bit watermarking scheme that preserves the original text distribution while enabling the embedding of provenance data, such as userID, TimeStamp, and modelID, within LLM-generated text. This enhances fast traceability without requiring access to the language model's API or prompts. We derive a lower bound on the number of tokens necessary for watermark detection at a fixed equal error rate, which provides insights on how to enhance the capacity. Comprehensive empirical evaluations across diverse tasks highlight the…
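To make "multi-bit" concrete, the sketch below packs the three provenance fields named in the abstract (userID, TimeStamp, modelID) into a single bit string and recovers them again. It only illustrates what a payload might look like. The field widths, the `Provenance` class, and the `pack`/`unpack` helpers are assumptions made for this example, not the encoding used by StealthInk, and the sketch does not show how bits are embedded in generated tokens.

```python
from dataclasses import dataclass

# Hypothetical field widths in bits; these are NOT taken from the paper.
USER_BITS, TIME_BITS, MODEL_BITS = 16, 32, 8
TOTAL_BITS = USER_BITS + TIME_BITS + MODEL_BITS


@dataclass(frozen=True)
class Provenance:
    """Illustrative provenance record (names are assumptions)."""
    user_id: int
    timestamp: int  # e.g. Unix time in seconds
    model_id: int


def pack(p: Provenance) -> str:
    """Serialize the record as a fixed-length string of '0'/'1' characters."""
    fields = (
        (p.user_id, USER_BITS),
        (p.timestamp, TIME_BITS),
        (p.model_id, MODEL_BITS),
    )
    parts = []
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        parts.append(format(value, f"0{width}b"))
    return "".join(parts)


def unpack(bits: str) -> Provenance:
    """Inverse of pack: split the bit string back into its three fields."""
    if len(bits) != TOTAL_BITS or set(bits) - {"0", "1"}:
        raise ValueError("expected a bit string of length TOTAL_BITS")
    user_end = USER_BITS
    time_end = USER_BITS + TIME_BITS
    return Provenance(
        user_id=int(bits[:user_end], 2),
        timestamp=int(bits[user_end:time_end], 2),
        model_id=int(bits[time_end:], 2),
    )


record = Provenance(user_id=1234, timestamp=1749427200, model_id=7)
message = pack(record)  # the multi-bit message a watermark would carry
recovered = unpack(message)
```

A zero-bit watermark, by contrast, carries no such message: a detector can only answer whether a watermark is present. With these assumed widths the payload is 56 bits, which is why the number of tokens needed for reliable decoding matters for capacity.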

Videos

StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models (SlidesLive)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Scientific Computing and Data Management · Generative Adversarial Networks and Image Synthesis