Perturb Your Data: Paraphrase-Guided Training Data Watermarking

Pranav Shetty; Mirazul Haque; Petr Babkin; Zhiqiang Ma; Xiaomo Liu; Manuela Veloso

arXiv:2512.17075·cs.CL·March 25, 2026


Open Access

TL;DR

SPECTRA is a watermarking method for training data. It paraphrases text with an LLM and selects paraphrases by their score under a separate model, so that the data remains reliably detectable in trained LLMs even when it makes up an extremely small proportion of the training corpus.

Contribution

The paper presents SPECTRA, a scalable watermarking approach that uses paraphrasing and scoring to make training data reliably detectable in LLMs, and reports that it outperforms existing methods.

Findings

01 Achieves a p-value gap of over nine orders of magnitude between data used for training and data not used for training

02 Detects training data even when it comprises less than 0.001% of the training corpus

03 Survives large-scale LLM training
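
To give a sense of the scale of these figures, the short sketch below works through the arithmetic. The p-values and the corpus size are hypothetical numbers chosen only for illustration, and the function name is not from the paper.

```python
import math


def p_value_gap_orders(p_trained: float, p_untrained: float) -> float:
    """Gap between two p-values, expressed in orders of magnitude (base 10)."""
    return math.log10(p_untrained) - math.log10(p_trained)


# Hypothetical p-values, NOT results from the paper:
# a model trained on the watermarked data vs. a model that never saw it.
gap = p_value_gap_orders(p_trained=1e-10, p_untrained=0.5)
print(f"gap: {gap:.2f} orders of magnitude")

# 0.001% is a fraction of 1 in 100,000. For a hypothetical corpus of
# one trillion tokens, that is ten million watermarked tokens.
corpus_tokens = 1_000_000_000_000
watermarked_tokens = corpus_tokens // 100_000
print(f"0.001% of corpus: {watermarked_tokens:,} tokens")
```

A gap above nine on this scale means the two p-values differ by a factor of more than a billion.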

Abstract

Training data detection is critical for enforcing copyright and data licensing, as Large Language Models (LLMs) are trained on massive text corpora scraped from the internet. We present SPECTRA, a watermarking approach that makes training data reliably detectable even when it comprises less than 0.001% of the training corpus. SPECTRA works by paraphrasing text using an LLM and assigning a score based on how likely each paraphrase is, according to a separate scoring model. A paraphrase is chosen so that its score closely matches that of the original text, to avoid introducing any distribution shifts. To test whether a suspect model has been trained on the watermarked data, we compare its token probabilities against those of the scoring model. We demonstrate that SPECTRA achieves a consistent p-value gap of over nine orders of magnitude when detecting data used for training versus data not…
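
The selection and comparison steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `Scorer`, `select_paraphrase`, `mean_score_gap`, and `toy_score` are hypothetical names, the score is assumed to be something like a mean per-token log-probability, and the toy scorer merely stands in for a real scoring model. The abstract does not state the exact statistical test, so the sketch stops at a raw score gap rather than a p-value.

```python
from typing import Callable, Sequence

# Assumption: a scorer maps a text to a likelihood-style score,
# e.g. a mean per-token log-probability under some language model.
Scorer = Callable[[str], float]


def select_paraphrase(original: str, candidates: Sequence[str], score: Scorer) -> str:
    """Return the candidate whose score is closest to the original's score."""
    target = score(original)
    return min(candidates, key=lambda c: abs(score(c) - target))


def mean_score_gap(texts: Sequence[str], suspect: Scorer, scorer: Scorer) -> float:
    """Average of (suspect score - scoring-model score) over the texts.

    Illustrative statistic only; the paper's actual test is not given here.
    """
    return sum(suspect(t) - scorer(t) for t in texts) / len(texts)


def toy_score(text: str) -> float:
    """Toy stand-in for a scoring model: longer texts get lower scores."""
    return -float(len(text.split()))


original = "the cat sat on the mat"
candidates = [
    "a cat was sitting on the mat",
    "the cat sat upon the mat",
    "cat on mat",
]
chosen = select_paraphrase(original, candidates, toy_score)
print(chosen)
```

In practice the candidates would come from a paraphrasing LLM and both scorers would wrap real language models. Choosing the candidate nearest in score to the original reflects the abstract's stated aim of avoiding distribution shift.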


Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Adversarial Robustness in Machine Learning · Advanced Graph Neural Networks