Proving membership in LLM pretraining data via data watermarks

Johnny Tian-Zheng Wei; Ryan Yixiang Wang; Robin Jia

arXiv:2402.10892 · cs.CR · August 20, 2024 · 2 cites

TL;DR

This paper introduces data watermarks as a way to detect whether copyrighted works were used to train large language models. Detection is framed as a hypothesis test, which provides guarantees on the false detection rate.

Contribution

It proposes a novel watermarking technique for training data, analyzes how watermark design affects detection power, and demonstrates robustness under model and dataset scaling.

Findings

01. Watermarks remain detectable as dataset size increases, provided model size increases with it.

02. Detection guarantees are provided through a hypothesis testing framework.

03. SHA hashes can be robustly detected in large models' training data.

Abstract

Detecting whether copyright holders' works were used in LLM pretraining is poised to be an important problem. This work proposes using data watermarks to enable principled detection with only black-box model access, provided that the rightholder contributed multiple training documents and watermarked them before public release. By applying a randomly sampled data watermark, detection can be framed as hypothesis testing, which provides guarantees on the false detection rate. We study two watermarks: one that inserts random sequences, and another that randomly substitutes characters with Unicode lookalikes. We first show how three aspects of watermark design -- watermark length, number of duplications, and interference -- affect the power of the hypothesis test. Next, we study how a watermark's detection strength changes under model and dataset scaling: while increasing the dataset size…
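
The two watermarks and the hypothesis test described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the three-entry lookalike table, the watermark length, and the toy loss function are all hypothetical, and the rank-based empirical p-value used here may differ from the test statistic in the paper. The idea is that a rightholder samples a watermark at random, so under the null hypothesis (the model never saw it) the true watermark is exchangeable with any other freshly sampled one.

```python
import random
import string

ALPHABET = string.ascii_lowercase + string.digits
# Hypothetical, tiny lookalike table: Latin letter -> Cyrillic homoglyph.
LOOKALIKES = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}


def sample_sequence_watermark(rng, length=10):
    """Draw a random character sequence to insert into each document."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def sample_unicode_key(rng):
    """Draw one random bit per mappable character (assumed design)."""
    return {ch: rng.random() < 0.5 for ch in LOOKALIKES}


def apply_unicode_watermark(text, key):
    """Swap each character for its lookalike wherever the key says so."""
    return "".join(LOOKALIKES[c] if key.get(c, False) else c for c in text)


def empirical_p_value(observed_loss, null_losses):
    """Rank-based p-value: share of null watermarks scored at least as low.

    The +1 terms count the observed watermark itself, so p is never 0.
    """
    hits = sum(1 for loss in null_losses if loss <= observed_loss)
    return (1 + hits) / (1 + len(null_losses))


def detect(loss_fn, watermark, rng, n_null=999, alpha=0.05):
    """Compare the model's loss on the true watermark to random ones.

    loss_fn stands in for a black-box query returning the model's loss
    on a string; it is an assumption of this sketch.
    """
    null_losses = [
        loss_fn(sample_sequence_watermark(rng, len(watermark)))
        for _ in range(n_null)
    ]
    p = empirical_p_value(loss_fn(watermark), null_losses)
    return p, p < alpha


def make_toy_loss(memorized):
    """Toy stand-in for a model: low loss only on the memorized string."""
    def loss_fn(text):
        return 0.1 if text == memorized else 5.0
    return loss_fn
```

With a toy model that memorized the watermark, `detect` returns the smallest achievable p-value, 1 / (n_null + 1); with a toy model that never saw it, every loss ties and the p-value is 1. Because the true watermark was sampled from the same distribution as the null watermarks, rejecting when p < alpha bounds the false detection rate by alpha regardless of how the model behaves.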

Taxonomy

Topics: Scientific Computing and Data Management · Research Data Management Practices · Data Quality and Management