PRADA: Protecting against DNN Model Stealing Attacks
Mika Juuti, Sebastian Szyller, Samuel Marchal, N. Asokan

TL;DR
This paper introduces PRADA, a detection method for DNN model stealing attacks that analyzes the distribution of queries sent to a prediction API, and reports that it detects every extraction attack tested with no false positives.
Contribution
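The paper presents PRADA, described as the first generic approach to detecting DNN model extraction attacks. It works by analyzing the distribution of the queries a client sends to a prediction API, so the API owner can flag an extraction attempt while it is in progress.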
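The idea of query distribution analysis can be illustrated with a minimal sketch. It assumes the detector tracks, for each client, the minimum distance between every new query and that client's earlier queries, and raises an alarm once those distances stop looking normally distributed. The Jarque-Bera statistic used here is a standard-library stand-in chosen for illustration, and `QueryMonitor`, `threshold`, and `min_samples` are hypothetical names and values; none of this is the authors' implementation.

```python
import math
import statistics


def l2(a, b):
    """Euclidean distance between two equal-length query vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class QueryMonitor:
    """Per-client monitor over minimum inter-query distances (illustrative only)."""

    def __init__(self, threshold=6.0, min_samples=30):
        self.threshold = threshold      # hypothetical alarm threshold
        self.min_samples = min_samples  # wait for this many distances before scoring
        self.queries = []
        self.min_dists = []

    def observe(self, query):
        """Record a query; return True if the distance distribution looks non-normal."""
        if self.queries:
            self.min_dists.append(min(l2(query, p) for p in self.queries))
        self.queries.append(query)
        return self.score() > self.threshold

    def score(self):
        """Jarque-Bera statistic of the recorded distances (0.0 if too few samples)."""
        d = self.min_dists
        n = len(d)
        if n < self.min_samples:
            return 0.0
        mean = statistics.fmean(d)
        m2 = sum((x - mean) ** 2 for x in d) / n
        if m2 == 0:
            return float("inf")  # all distances identical: maximally suspicious
        m3 = sum((x - mean) ** 3 for x in d) / n
        m4 = sum((x - mean) ** 4 for x in d) / n
        skew = m3 / m2 ** 1.5
        kurt = m4 / m2 ** 2
        return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
```

A client that keeps resubmitting the same or nearly the same input yields distances clustered at zero, which pushes the statistic well past the threshold. A real deployment would need the threshold calibrated on benign traffic.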
Findings
PRADA detects all tested model extraction attacks with no false positives.
The proposed extraction attacks, built on new synthetic-query generation and tuned training hyperparameters, outperform prior state-of-the-art attacks in accuracy and in the transferability of targeted and non-targeted adversarial examples.
Abstract
Machine learning (ML) applications are increasingly prevalent. Protecting the confidentiality of ML models becomes paramount for two reasons: (a) a model can be a business advantage to its owner, and (b) an adversary may use a stolen model to find transferable adversarial examples that can evade classification by the original model. Access to the model can be restricted to be only via well-defined prediction APIs. Nevertheless, prediction APIs still provide enough information to allow an adversary to mount model extraction attacks by sending repeated queries via the prediction API. In this paper, we describe new model extraction attacks using novel approaches for generating synthetic queries, and optimizing training hyperparameters. Our attacks outperform state-of-the-art model extraction in terms of transferability of both targeted and non-targeted adversarial examples (up to +29-44…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Algorithms
