TL;DR
This paper demonstrates that it is possible to effectively copy a black-box CNN model by querying it with random non-labeled data and training a new model on the generated predictions, achieving near-original performance.
Contribution
It introduces a novel method of model stealing using random non-labeled data to create high-performance copycat CNNs without access to original training data.
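The two-step idea (query the black box with random unlabeled inputs, then train a copy on the returned predictions) can be sketched on a toy problem. This is only an illustration under loud assumptions: the target is a small hypothetical linear classifier rather than a CNN, the inputs are Gaussian vectors rather than images, and every name (`query_target`, `train_copycat`, `_TARGET_W`) is invented for this sketch and does not come from the paper.

```python
import math
import random

random.seed(0)
N_FEATURES, N_CLASSES = 4, 3

# Hypothetical stand-in for the black-box target: a fixed linear classifier.
# The attacker never reads these weights directly.
_TARGET_W = [[random.gauss(0, 1) for _ in range(N_FEATURES)]
             for _ in range(N_CLASSES)]


def scores(weights, x):
    """Per-class linear scores for one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def query_target(x):
    """Black-box API: the attacker only sees the predicted label."""
    return argmax(scores(_TARGET_W, x))


def random_input():
    """Random unlabeled query (assumed Gaussian for this toy example)."""
    return [random.gauss(0, 1) for _ in range(N_FEATURES)]


def train_copycat(inputs, labels, epochs=20, lr=0.1):
    """Fit a softmax-regression copy to the stolen labels with plain SGD."""
    weights = [[0.0] * N_FEATURES for _ in range(N_CLASSES)]
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            s = scores(weights, x)
            m = max(s)  # subtract the max for numerical stability
            exps = [math.exp(v - m) for v in s]
            total = sum(exps)
            for k in range(N_CLASSES):
                # Gradient of cross-entropy w.r.t. the class-k score.
                grad = exps[k] / total - (1.0 if k == y else 0.0)
                for j in range(N_FEATURES):
                    weights[k][j] -= lr * grad * x[j]
    return weights


# Step 1: query the target with random data and record its predictions.
queries = [random_input() for _ in range(2000)]
stolen_labels = [query_target(x) for x in queries]

# Step 2: train the copy on the (query, stolen label) pairs.
copycat_w = train_copycat(queries, stolen_labels)

# Measure how often the copy agrees with the target on fresh queries.
test_inputs = [random_input() for _ in range(1000)]
agreement = sum(
    argmax(scores(copycat_w, x)) == query_target(x) for x in test_inputs
) / len(test_inputs)
print(f"copycat agrees with target on {agreement:.1%} of held-out queries")
```

Agreement with the target on fresh queries is a rough analogue of the "percentage of target performance" figures reported for the paper. The toy setting is much easier than copying a real CNN, so the number it prints says nothing about the paper's results.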
Findings
Copycat CNNs trained only on non-problem-domain data achieved over 93.7% of the target model's performance.
Copycat CNNs trained with additional problem-domain data achieved over 98.6% of the target model's performance.
The attack also copied the Microsoft Azure Emotion API, with the copy reaching at least 97.3% of the API's performance.
Abstract
In the past few years, Convolutional Neural Networks (CNNs) have been achieving state-of-the-art performance on a variety of problems. Many companies employ resources and money to generate these models and provide them as an API, therefore it is in their best interest to protect them, i.e., to avoid that someone else copies them. Recent studies revealed that state-of-the-art CNNs are vulnerable to adversarial examples attacks, and this weakness indicates that CNNs do not need to operate in the problem domain (PD). Therefore, we hypothesize that they also do not need to be trained with examples of the PD in order to operate in it. Given these facts, in this paper, we investigate if a target black-box CNN can be copied by persuading it to confess its knowledge through random non-labeled data. The copy is two-fold: i) the target network is queried with random data and its predictions are…
