Dataset Inference: Ownership Resolution in Machine Learning
Pratyush Maini, Mohammad Yaghini, Nicolas Papernot

TL;DR
This paper introduces dataset inference, a method for determining whether a suspected model copy contains knowledge private to the original model's training set, which gives model owners a defense against model stealing attacks.
Contribution
It proposes dataset inference, a technique that combines statistical testing with estimates of how far data points lie from a model's decision boundary to decide whether a suspect model was derived from the owner's training data.
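The statistical-testing half of this idea can be illustrated with a minimal sketch. It is not the authors' implementation: the scores are synthetic stand-ins for a per-point "distance to the decision boundary", the function name `one_sided_p_value` is hypothetical, and the p-value uses a normal approximation to a Welch-style test. The assumption being tested is that a model trained on the owner's data gives the owner's private points systematically larger scores than unseen points from the same distribution.

```python
import math
import random
from statistics import mean, variance


def one_sided_p_value(private_scores, unseen_scores):
    """Approximate p-value for H0: mean(private) <= mean(unseen)."""
    n_p, n_u = len(private_scores), len(unseen_scores)
    # Standard error of the difference in means, allowing unequal variances.
    se = math.sqrt(variance(private_scores) / n_p + variance(unseen_scores) / n_u)
    t_stat = (mean(private_scores) - mean(unseen_scores)) / se
    # Upper tail of the standard normal (an approximation to the t distribution).
    return 0.5 * math.erfc(t_stat / math.sqrt(2))


# Synthetic, hypothetical scores; real scores would come from querying the suspect model.
rng = random.Random(0)
unseen = [rng.gauss(1.0, 0.5) for _ in range(50)]
private_stolen = [rng.gauss(2.0, 0.5) for _ in range(50)]  # suspect saw the owner's data
private_indep = [rng.gauss(1.0, 0.5) for _ in range(50)]   # suspect never saw it

p_stolen = one_sided_p_value(private_stolen, unseen)
p_indep = one_sided_p_value(private_indep, unseen)
print(f"stolen-like suspect p-value:      {p_stolen:.2e}")
print(f"independent-like suspect p-value: {p_indep:.2e}")
```

A small p-value for the first suspect would let the owner reject the hypothesis that the model treats private and unseen points alike. The separation between the two synthetic groups is chosen for illustration only and says nothing about the effect sizes reported in the paper.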
Findings
Lets model owners claim with greater than 99% confidence that their model was stolen.
Defends against state-of-the-art attacks without requiring the defended model to be retrained or overfit.
Requires exposing only 50 of the stolen model's training points.
Abstract
With increasingly more data and computation involved in their training, machine learning models constitute valuable intellectual property. This has spurred interest in model stealing, which is made more practical by advances in learning with partial, little, or no supervision. Existing defenses focus on inserting unique watermarks in a model's decision surface, but this is insufficient: the watermarks are not sampled from the training distribution and thus are not always preserved during model stealing. In this paper, we make the key observation that knowledge contained in the stolen model's training set is what is common to all stolen copies. The adversary's goal, irrespective of the attack employed, is always to extract this knowledge or its by-products. This gives the original model's owner a strong advantage over the adversary: model owners have access to the original training data.…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Advanced Neural Network Applications
