MC$^2$Mark: Distortion-Free Multi-Bit Watermarking for Long Messages
Xuehao Cui, Ruibo Chen, Yihan Wu, Heng Huang

TL;DR
MC$^2$Mark is a novel distortion-free multi-bit watermarking framework that reliably embeds and decodes long messages in generated text, enhancing detectability and robustness without compromising quality.
Contribution
It introduces a new watermarking method combining Multi-Channel Colored Reweighting and Multi-Layer Sequential Reweighting for long message embedding in language models.
Findings
Achieves near-perfect accuracy for short messages.
Exceeds the second-best method by nearly 30% for long messages.
Improves detectability and robustness over prior methods.
Abstract
Large language models now produce text indistinguishable from human writing, which increases the need for reliable provenance tracing. Multi-bit watermarking can embed identifiers into generated text, but existing methods struggle to keep both text quality and watermark strength while carrying long messages. We propose MC$^2$Mark, a distortion-free multi-bit watermarking framework designed for reliable embedding and decoding of long messages. Our key technical idea is Multi-Channel Colored Reweighting, which encodes bits through structured token reweighting while keeping the token distribution unbiased, together with Multi-Layer Sequential Reweighting to strengthen the watermark signal and an evidence-accumulation detector for message recovery. Experiments show that MC$^2$Mark improves detectability and robustness over prior multi-bit watermarking methods while preserving generation…
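To make the "unbiased reweighting" property concrete, here is a minimal toy sketch in Python. It is not the paper's Multi-Channel Colored Reweighting; it is a simplified, assumed scheme in which a key-derived token ordering and a single message bit decide which half of the cumulative probability mass is boosted. The function names `reweight` and `average_over_keys` are hypothetical. The check at the end confirms that, averaged over all possible orderings (standing in for a uniformly random key), the reweighted distribution equals the original one exactly.

```python
from fractions import Fraction
from itertools import permutations


def reweight(probs, order, bit):
    """Reweight `probs` given a key-derived token `order` and one message bit.

    Tokens are laid out on [0, 1] in `order`. For bit 0 the mass in the upper
    half is doubled and the lower half is zeroed; for bit 1 it is the reverse.
    This is an illustrative toy rule, not the rule used by MC^2Mark.
    """
    def cdf_map(x):
        if bit == 0:
            return max(2 * x - 1, Fraction(0))
        return min(2 * x, Fraction(1))

    out = [Fraction(0)] * len(probs)
    left = Fraction(0)
    for tok in order:
        right = left + probs[tok]
        # New mass of the token is the mapped length of its interval.
        out[tok] = cdf_map(right) - cdf_map(left)
        left = right
    return out


def average_over_keys(probs, bit):
    """Average the reweighted distribution over every possible token order."""
    orders = list(permutations(range(len(probs))))
    total = [Fraction(0)] * len(probs)
    for order in orders:
        for tok, q in enumerate(reweight(probs, order, bit)):
            total[tok] += q
    return [t / len(orders) for t in total]


probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
for bit in (0, 1):
    # Unbiasedness: the key-averaged distribution is exactly the original.
    assert average_over_keys(probs, bit) == probs
```

For any single ordering the distribution is visibly skewed, which is what a detector with the key could accumulate evidence from, while a reader without the key sees the original distribution in expectation. Exact rational arithmetic is used so the equality check does not depend on floating-point tolerance.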
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
