Stealing Machine Learning Models via Prediction APIs
Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, Thomas Ristenpart

TL;DR
This paper demonstrates that machine learning models deployed behind prediction APIs can be stolen efficiently through black-box queries, even when the API returns only partial outputs (for example, class labels without confidence scores), which raises significant security concerns.
Contribution
It introduces simple, effective model extraction attacks that apply to popular ML model types and evaluates them against real online ML services, showing that deployed models are vulnerable.
Findings
High-fidelity model extraction achievable with simple attacks
Attacks effective even when confidence scores are omitted
Real-world services like BigML and Amazon ML are vulnerable
Abstract
Machine learning (ML) models may be deemed confidential due to their sensitive training data, commercial value, or use in security applications. Increasingly often, confidential ML models are being deployed with publicly accessible query interfaces. ML-as-a-service ("predictive analytics") systems are an example: Some allow users to train models on potentially sensitive data and charge others for access on a pay-per-query basis. The tension between model confidentiality and public access motivates our investigation of model extraction attacks. In such attacks, an adversary with black-box access, but no prior knowledge of an ML model's parameters or training data, aims to duplicate the functionality of (i.e., "steal") the model. Unlike in classical learning theory settings, ML-as-a-service offerings may accept partial feature vectors as inputs and include confidence values with…
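To make the black-box setting concrete, here is a minimal sketch of an equation-solving style of extraction, assuming the target is a binary logistic regression model whose API returns an unrounded confidence value. It is an illustration under those assumptions, not the authors' code: `make_oracle` is a hypothetical local stand-in for a remote prediction API, and the secret parameters are generated only so the result can be checked.

```python
import numpy as np

def make_oracle(w, b):
    """Hypothetical stand-in for a prediction API: returns P(y=1 | x)."""
    def query(x):
        return 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return query

def extract_logistic(query, d, rng):
    """Recover (w, b) of a d-feature logistic model from d + 1 queries."""
    # Each query yields one linear equation: logit(p) = w . x + b
    X = rng.normal(size=(d + 1, d))
    p = np.array([query(x) for x in X])
    logits = np.log(p / (1.0 - p))            # invert the sigmoid
    A = np.hstack([X, np.ones((d + 1, 1))])   # unknowns are (w, b)
    theta = np.linalg.solve(A, logits)
    return theta[:d], theta[d]

rng = np.random.default_rng(0)
d = 5
w_true = rng.normal(size=d)   # secret parameters (assumed for the demo)
b_true = 0.3
oracle = make_oracle(w_true, b_true)

w_hat, b_hat = extract_logistic(oracle, d, rng)

# Fidelity: fraction of fresh inputs on which the two models agree
X_test = rng.normal(size=(1000, d))
fidelity = np.mean((X_test @ w_true + b_true > 0) == (X_test @ w_hat + b_hat > 0))
print("max parameter error:", np.max(np.abs(w_hat - w_true)))
print("label agreement:", fidelity)
```

With exact confidences, d + 1 queries determine the d weights and the bias of this model. If the API rounds its confidences or returns only labels, this direct inversion no longer applies as written and the adversary needs more queries and a different strategy.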
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Network Security and Intrusion Detection
