MarkTune: Improving the Quality-Detectability Trade-off in Open-Weight LLM Watermarking

Yizhou Zhao; Zhiwei Steven Wu; Adam Block

arXiv:2512.04044 · cs.LG · December 4, 2025

Open Access

TL;DR

MarkTune is a fine-tuning framework that enhances watermark detectability in open-weight language models while maintaining high text quality, addressing limitations of existing watermarking methods.

Contribution

It introduces a theoretically grounded, on-policy fine-tuning method that improves the quality-detectability trade-off for open-weight watermarking techniques like GaussMark.

Findings

1. MarkTune significantly improves watermark detection power without degrading text quality.
2. It achieves robustness against paraphrasing and fine-tuning attacks.
3. The method generalizes well across different datasets and models.

Abstract

Watermarking aims to embed hidden signals in generated text that can be reliably detected when given access to a secret key. Open-weight language models pose acute challenges for such watermarking schemes because the inference-time interventions that dominate contemporary approaches cannot be enforced once model weights are public. Existing watermarking techniques for open-weight models, such as the recently proposed GaussMark, typically rely on small modifications to model weights. These modifications can yield signals detectable to those equipped with a secret key, but achieving detection power comparable to inference-time watermarks generally requires weight perturbations that noticeably reduce generation quality. We introduce MarkTune, a theoretically principled, on-policy fine-tuning framework that treats the GaussMark signal as a reward while simultaneously regularizing against degradation in…
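To make the idea of a weight-perturbation watermark concrete, the following is a minimal toy sketch, not the GaussMark or MarkTune procedure itself. It assumes a one-layer "model" that is just a softmax over a vocabulary, a perturbation drawn from a Gaussian seeded by the secret key, and a detection score that correlates the key's noise with the gradient of the text's log-likelihood at the unperturbed weights. The model, the score's exact form, and every name here (`key_to_noise`, `detection_score`, `VOCAB`, `SIGMA`, `KEY`) are illustrative assumptions rather than details taken from the paper.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def key_to_noise(secret_key, dim, sigma):
    """Derive the Gaussian weight perturbation from the secret key (toy assumption)."""
    rng = random.Random(secret_key)
    return [rng.gauss(0.0, sigma) for _ in range(dim)]


def sample_tokens(logits, n, seed):
    """Sample n token ids from the softmax distribution defined by the logits."""
    rng = random.Random(seed)
    return rng.choices(range(len(logits)), weights=softmax(logits), k=n)


def detection_score(tokens, base_logits, secret_key, sigma):
    """Correlate the key's noise with the log-likelihood gradient at the base weights.

    For a softmax model, the gradient of the log-likelihood with respect to the
    logits is (token counts) - n * (base probabilities). For text generated
    independently of the key, the score is approximately standard normal.
    """
    probs = softmax(base_logits)
    n = len(tokens)
    grad = [-n * p for p in probs]
    for t in tokens:
        grad[t] += 1.0
    noise = key_to_noise(secret_key, len(base_logits), sigma)
    dot = sum(x * g for x, g in zip(noise, grad))
    norm = math.sqrt(sum(g * g for g in grad))
    return dot / (sigma * norm)


# Hypothetical settings for the toy experiment.
VOCAB, SIGMA, KEY = 200, 0.3, 1234
base = [0.0] * VOCAB
marked = [b + x for b, x in zip(base, key_to_noise(KEY, VOCAB, SIGMA))]

z_marked = detection_score(sample_tokens(marked, 20000, seed=1), base, KEY, SIGMA)
z_plain = detection_score(sample_tokens(base, 20000, seed=2), base, KEY, SIGMA)
print(f"watermarked text score: {z_marked:.2f}")
print(f"unwatermarked text score: {z_plain:.2f}")
```

In this toy setting, raising `SIGMA` increases the watermarked score but also moves the sampling distribution further from the base model, which mirrors the quality-detectability trade-off the abstract describes. A fine-tuning approach in the spirit of the abstract would use a score like this as a reward while penalizing drift from the original model, though the exact objective is not given in the text above.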

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Steganography and Watermarking Techniques · Advanced Malware Detection Techniques