On the Limitations of Model Stealing with Uncertainty Quantification Models
David Pape, Sina Däubener, Thorsten Eisenhofer, Antonio Emanuele Cinà, Lea Schönherr

TL;DR
This paper investigates the impact of uncertainty quantification on model stealing, finding that current methods provide limited improvements due to low diversity in model predictions during training.
Contribution
The paper compares five uncertainty quantification methods in a model stealing task and analyzes why they offer minimal gains, tracing the cause to the low diversity of model predictions during training.
Findings
Uncertainty quantification models only marginally improve fidelity.
Models tend to have similar predictions during training.
Low diversity limits the effectiveness of uncertainty-based stealing.
Abstract
Model stealing aims at inferring a victim model's functionality at a fraction of the original training cost. While the goal is clear, in practice the model's architecture, weight dimension, and original training data cannot be determined exactly, leading to mutual uncertainty during stealing. In this work, we explicitly tackle this uncertainty by generating multiple possible networks and combining their predictions to improve the quality of the stolen model. For this, we compare five popular uncertainty quantification models in a model stealing task. Surprisingly, our results indicate that the considered models only lead to marginal improvements in terms of label agreement (i.e., fidelity) to the stolen model. To find the cause of this, we inspect the diversity of the model's prediction by looking at the prediction variance as a function of training iterations. We realize that during…
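The two quantities the abstract relies on, fidelity (label agreement with the victim) and prediction variance across several candidate networks, can be illustrated with a small sketch. This is not the authors' code: the function names, the majority-vote combination rule, and the exact variance definition (population variance of each class probability across members, averaged over inputs and classes) are assumptions made for illustration. Only the Python standard library is used.

```python
from collections import Counter
from statistics import pvariance


def majority_vote(member_labels):
    """Combine per-member label predictions into one label per input.

    member_labels[m][i] is the label member m assigns to input i.
    Ties go to the label encountered first among the members.
    (Majority voting is an assumed combination rule, chosen for simplicity.)
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*member_labels)]


def fidelity(stolen_labels, victim_labels):
    """Fraction of inputs on which the stolen model agrees with the victim."""
    if len(stolen_labels) != len(victim_labels):
        raise ValueError("label lists must have the same length")
    matches = sum(s == v for s, v in zip(stolen_labels, victim_labels))
    return matches / len(victim_labels)


def mean_prediction_variance(member_probs):
    """Average variance across members of the predicted class probabilities.

    member_probs[m][i][c] is the probability member m assigns to class c
    for input i. A value near zero means the members barely disagree.
    (This particular definition of diversity is an assumption.)
    """
    variances = [
        pvariance(class_probs)
        for per_input in zip(*member_probs)   # one tuple of members per input
        for class_probs in zip(*per_input)    # one tuple of members per class
    ]
    return sum(variances) / len(variances)


# Toy data: labels from a hypothetical victim and three stolen candidates.
victim = [0, 1, 1, 0]
members = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]
combined = majority_vote(members)
single_fidelity = fidelity(members[0], victim)
combined_fidelity = fidelity(combined, victim)
```

To follow diversity over training as the abstract describes, one could call `mean_prediction_variance` on the members' outputs at each saved checkpoint and plot the result against the iteration count. If the value stays close to zero, combining the members cannot change many labels, which is consistent with the marginal fidelity gains reported.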
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
