Beyond Slow Signs in High-fidelity Model Extraction
Hanna Foerster, Robert Mullins, Ilia Shumailov, Jamie Hayes

TL;DR
This paper makes cryptanalytic model extraction for deep neural networks substantially faster and challenges prior assumptions about where the attack's cost lies, with practical implications for attacking models trained on standard benchmarks such as MNIST.
Contribution
It introduces optimizations to existing extraction methods, identifies the true bottleneck as weight extraction, and proposes new benchmarking approaches for future attacks.
Findings
Extraction time for a 16,721-parameter model is reduced to 98 minutes.
Weight extraction, not sign extraction, is the main bottleneck.
Performance improves by up to 14.8 times over previous methods.
Abstract
Deep neural networks, costly to train and rich in intellectual property value, are increasingly threatened by model extraction attacks that compromise their confidentiality. Previous attacks have succeeded in reverse-engineering model parameters up to a precision of float64 for models trained on random data with at most three hidden layers using cryptanalytical techniques. However, the process was identified to be very time consuming and not feasible for larger and deeper models trained on standard benchmarks. Our study evaluates the feasibility of parameter extraction methods of Carlini et al. [1] further enhanced by Canales-Martínez et al. [2] for models trained on standard benchmarks. We introduce a unified codebase that integrates previous methods and reveal that computational tools can significantly influence performance. We develop further optimisations to the end-to-end attack…
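To make the distinction between weight extraction and sign extraction more concrete, here is a minimal sketch of the general idea behind cryptanalytic weight recovery, assuming a toy one-hidden-layer ReLU network. It is not the paper's codebase: the network values, `oracle`, `second_difference`, and `recover_signature` are all hypothetical names chosen for illustration. At an input where one ReLU switches on or off (a critical point), the second-order finite difference of the output along each input direction is proportional to the magnitude of that neuron's weight, so ratios of these differences recover the weight row up to a shared scale and sign. Resolving that remaining sign is the separate sign-extraction step.

```python
import numpy as np

# Hypothetical toy victim: f(x) = a . relu(W x + b) + c (values chosen for illustration).
W = np.array([[2.0, -1.0, 0.5],
              [1.0,  1.0, 1.0]])
b = np.array([0.5, 3.0])
a = np.array([1.5, -2.0])
c = 0.1

def oracle(x):
    """Black-box query: the attacker only sees this scalar output."""
    return float(a @ np.maximum(W @ x + b, 0.0) + c)

def second_difference(x, direction, eps):
    """f(x + eps*d) + f(x - eps*d) - 2 f(x); nonzero only across a ReLU kink."""
    return oracle(x + eps * direction) + oracle(x - eps * direction) - 2.0 * oracle(x)

def recover_signature(x_star, dim, eps=1e-3):
    """Recover one hidden neuron's weight row up to a shared scale and sign."""
    basis = np.eye(dim)
    # Magnitudes: the kink height along e_i is proportional to |w_i|.
    mags = np.array([abs(second_difference(x_star, basis[i], eps)) for i in range(dim)])
    ratios = mags / mags[0]
    # Relative signs: |w_0 + w_i| equals |w_0| + |w_i| only if the signs agree.
    for i in range(1, dim):
        joint = abs(second_difference(x_star, basis[0] + basis[i], eps))
        if not np.isclose(joint, mags[0] + mags[i], rtol=1e-6):
            ratios[i] = -ratios[i]
    return ratios

# Critical point of neuron 0 (W[0] @ x + b[0] == 0). Built from the known weights
# here for brevity; an attacker would have to locate it with queries alone.
x_star = np.array([0.0, 0.5, 0.0])
signature = recover_signature(x_star, dim=3)
print(signature)  # close to W[0] / W[0, 0], i.e. roughly [1, -0.5, 0.25]
```

The sketch assumes that only one neuron is at its critical point and that the first weight of the row is nonzero. Deeper networks, finding critical points from queries, and reaching float64 precision all require considerably more machinery than shown here.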
Taxonomy
Topics: Model Reduction and Neural Networks
