Preventing Distillation-based Attacks on Neural Network IP
Mahdieh Grailoo, Zain Ul Abideen, Mairo Leier, Samuel Pagliarini

TL;DR
This paper introduces a novel poisoning method to protect neural network IP in hardware by preventing distillation-based attacks, effectively reducing model theft risk without high overheads or toolchain modifications.
Contribution
This work is the first to propose a poisoning approach specifically designed to defend hardware-implemented neural networks against distillation attacks.
Findings
Poisoning significantly reduces stolen model accuracy
Maintains original prediction distributions and functionality
No high overheads or toolchain modifications required
Abstract
Neural networks (NNs) are already deployed in hardware today and have become valuable intellectual property (IP), since many hours are invested in their training and optimization. Attackers may therefore be interested in copying, reverse engineering, or even modifying this IP. Current practices in hardware obfuscation, including the widely studied logic locking technique, are insufficient to protect the actual IP of a well-trained NN: its weights. Simply hiding the weights behind a key-based scheme is inefficient (resource-hungry) and inadequate (attackers can exploit knowledge distillation). This paper proposes an intuitive method that poisons the predictions in order to prevent distillation-based attacks; it is the first work to consider such a poisoning approach in hardware-implemented NNs. The proposed technique obfuscates an NN so that an attacker cannot train the NN entirely or accurately. We…
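To make the idea of prediction poisoning concrete, here is a minimal Python sketch. It is not the paper's method, which the truncated abstract does not describe; it is a generic illustration under the assumption that the defender wants to keep the top-1 decision intact while scrambling the remaining class probabilities that a distillation attacker would use as soft labels. The function name `poison_prediction` and the example probabilities are hypothetical.

```python
import random


def poison_prediction(probs, rng):
    """Illustrative only (not the paper's scheme): keep the top-1 class
    untouched and shuffle the remaining probabilities among the other
    classes, so the soft labels no longer reflect class similarity."""
    # Index of the predicted (top-1) class.
    top = max(range(len(probs)), key=probs.__getitem__)
    # Collect the non-top entries and permute their values.
    rest_idx = [i for i in range(len(probs)) if i != top]
    rest_vals = [probs[i] for i in rest_idx]
    rng.shuffle(rest_vals)
    poisoned = list(probs)
    for i, v in zip(rest_idx, rest_vals):
        poisoned[i] = v
    return poisoned


# Hypothetical softmax output for a 4-class model.
clean = [0.70, 0.20, 0.07, 0.03]
poisoned = poison_prediction(clean, random.Random(0))
print("clean:   ", clean)
print("poisoned:", poisoned)
```

A legitimate user who only reads the predicted class sees no change, while a student network trained on the poisoned soft labels receives misleading information about how the non-predicted classes relate to each other. Whether such a simple permutation would withstand a determined attacker is not something this sketch establishes.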
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Integrated Circuits and Semiconductor Failure Analysis
