Data Isotopes for Data Provenance in DNNs
Emily Wenger, Xiuyu Li, Ben Y. Zhao, Vitaly Shmatikov

TL;DR
This paper introduces a system that allows users to verify if their data was used in training a DNN by creating unique isotopes that induce detectable spurious features, turning model vulnerabilities into a data provenance tool.
Contribution
The paper presents a practical method for data provenance in DNNs using isotopes, enabling detection of data usage without requiring access to training data or process details.
Findings
High accuracy in detecting isotopes across multiple settings
Effective on models trained on large datasets such as ImageNet and on public ML platforms
Robust against adaptive countermeasures
Abstract
Today, creators of data-hungry deep neural networks (DNNs) scour the Internet for training fodder, leaving users with little control over or knowledge of when their data is appropriated for model training. To empower users to counteract unwanted data use, we design, implement and evaluate a practical system that enables users to detect if their data was used to train a DNN model. We show how users can create special data points we call isotopes, which introduce "spurious features" into DNNs during training. With only query access to a trained model and no knowledge of the model training process, or control of the data labels, a user can apply statistical hypothesis testing to detect if a model has learned the spurious features associated with their isotopes by training on the user's data. This effectively turns DNNs' vulnerability to memorization and spurious correlations into a tool…
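The detection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which hypothesis test is used, so a one-sided sign-flip permutation test on paired queries is swapped in here because it needs only the standard library. The function `mock_model_prob` is a hypothetical stand-in for query access to a trained model, and the numbers it returns (baseline probabilities, the size of the shift caused by the mark) are invented for illustration only.

```python
import random


def mock_model_prob(has_mark, trained_on_isotopes, rng):
    """Hypothetical stand-in for one query to a trained model.

    Returns the probability the model assigns to the isotope label.
    The baseline range and the 0.10 shift are made-up illustration values.
    """
    base = rng.uniform(0.05, 0.15)
    shift = 0.10 if (has_mark and trained_on_isotopes) else 0.0
    return base + shift


def sign_flip_p_value(diffs, n_resamples=2000, seed=0):
    """One-sided paired permutation test.

    Null hypothesis: the mark has no effect, so each paired difference is
    equally likely to be positive or negative.
    """
    rng = random.Random(seed)
    n = len(diffs)
    observed = sum(diffs) / n
    hits = 0
    for _ in range(n_resamples):
        # Randomly flip the sign of each difference and recompute the mean.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if flipped >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


def detect_isotopes(trained_on_isotopes, n_pairs=50, alpha=0.05, seed=1):
    """Query the mock model on paired inputs (with and without the mark)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_pairs):
        marked = mock_model_prob(True, trained_on_isotopes, rng)
        clean = mock_model_prob(False, trained_on_isotopes, rng)
        diffs.append(marked - clean)
    p = sign_flip_p_value(diffs)
    return p < alpha, p


if __name__ == "__main__":
    for trained in (True, False):
        detected, p = detect_isotopes(trained)
        print(f"trained_on_isotopes={trained}: detected={detected}, p={p:.4f}")
```

The paired design matters: each marked query is compared against an unmarked query drawn under the same conditions, so the test responds to the effect of the mark rather than to how likely the label is in general. The threshold `alpha` bounds the false-positive rate, meaning a model that never saw the isotopes will still occasionally be flagged at roughly that rate.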
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
