Dataset Watermarking for Closed LLMs with Provable Detection

Pengrun Huang; Kamalika Chaudhuri; Yu-Xiang Wang

arXiv:2605.06865 · cs.LG · May 11, 2026

TL;DR

This paper presents a dataset watermarking technique for closed large language models. Training on a watermarked dataset leaves a signature that can be detected, with provable guarantees, from model-generated outputs alone, and the watermark has minimal impact on model utility.

Contribution

The paper introduces the first dataset watermarking method for closed LLMs with provable detection. The watermark is embedded by rephrasing the data to increase the co-occurrence frequency of randomly selected word pairs.

Findings

1. Reliable watermark detection with p < 0.01 in fine-tuning

2. Effective watermark detection even when watermarked data is 1% of total tokens

3. Preserves model utility and semantic integrity

Abstract

Large language models (LLMs) are pre-trained and post-trained on vast amounts of loosely curated data, raising the possibility that these models may have been trained on proprietary datasets or the same benchmarks used for evaluation. This motivates the need for dataset watermarking: designing datasets such that training on them leaves detectable signatures in the resulting model. Prior work has explored this problem for open models. We introduce the first dataset watermarking method for closed LLMs with provable detection. In particular, we embed a dataset-level watermark signal by increasing the co-occurrence frequency of randomly selected word pairs through rephrasing, and detect it using a statistical test on co-occurrence patterns in model-generated outputs. We evaluate our method with multiple base models and benchmark datasets and show that it reliably detects the watermark ($p…
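The detection idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure: the whitespace tokenization, the per-text co-occurrence count, and the Monte Carlo comparison against decoy pair sets are all assumptions made for illustration, and the names `sample_pairs`, `cooccurrence_count`, and `detection_p_value` are hypothetical. The sketch relies on one observation: if the model never saw the watermarked data, the secret word pairs should look no different from any other pairs drawn the same way, so the rank of their co-occurrence count among decoys gives a p-value.

```python
import random
from itertools import combinations


def sample_pairs(vocab, n_pairs, seed):
    """Draw n_pairs distinct unordered word pairs, determined by the seed."""
    rng = random.Random(seed)
    all_pairs = list(combinations(sorted(vocab), 2))
    return rng.sample(all_pairs, n_pairs)


def cooccurrence_count(texts, pairs):
    """Count (text, pair) combinations where both words appear in the text."""
    total = 0
    for text in texts:
        words = set(text.lower().split())
        total += sum(1 for a, b in pairs if a in words and b in words)
    return total


def detection_p_value(texts, vocab, secret_seed, n_pairs, n_decoys=999):
    """Monte Carlo p-value (assumed test, not the paper's): rank the secret
    pairs' co-occurrence count among counts for randomly drawn decoy pair sets."""
    secret_pairs = sample_pairs(vocab, n_pairs, secret_seed)
    observed = cooccurrence_count(texts, secret_pairs)
    rng = random.Random(0)  # decoy randomness, unrelated to the secret seed
    exceed = 0
    for _ in range(n_decoys):
        decoy = sample_pairs(vocab, n_pairs, rng.getrandbits(64))
        if cooccurrence_count(texts, decoy) >= observed:
            exceed += 1
    # The +1 terms keep the p-value valid for a finite number of decoys.
    return (1 + exceed) / (1 + n_decoys)
```

In this sketch, `texts` stands in for outputs sampled from the closed model, which is the only access the detector needs. A small p-value suggests the secret pairs co-occur more often than chance would explain; the embedding step (rephrasing the dataset so that the secret pairs co-occur more often) is not shown.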
