Preventing Distillation-based Attacks on Neural Network IP

Mahdieh Grailoo; Zain Ul Abideen; Mairo Leier; Samuel Pagliarini

arXiv:2204.00292 · cs.CR · April 4, 2022

Open Access

TL;DR

This paper introduces a prediction-poisoning method that protects neural network IP implemented in hardware against distillation-based attacks, reducing the risk of model theft without high overheads or toolchain modifications.

Contribution

This is the first work to propose a poisoning approach for defending hardware-implemented neural networks against distillation-based attacks.

Findings

01. Poisoning significantly reduces stolen model accuracy

02. Maintains original prediction distributions and functionality

03. No high overheads or toolchain modifications required

Abstract

Neural networks (NNs) are already deployed in hardware today, becoming valuable intellectual property (IP) as many hours are invested in their training and optimization. Therefore, attackers may be interested in copying, reverse engineering, or even modifying this IP. The current practices in hardware obfuscation, including the widely studied logic locking technique, are insufficient to protect the actual IP of a well-trained NN: its weights. Simply hiding the weights behind a key-based scheme is inefficient (resource-hungry) and inadequate (attackers can exploit knowledge distillation). This paper proposes an intuitive method that poisons the predictions in order to prevent distillation-based attacks; this is the first work to consider such a poisoning approach in hardware-implemented NNs. The proposed technique obfuscates an NN so that an attacker cannot train the NN entirely or accurately. We…
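To make the idea of poisoned predictions concrete, the sketch below shows one generic way an output probability vector could be altered so that the top-1 class is unchanged while the relative ordering of the remaining classes, which a distillation attacker would learn from, is scrambled. This is an illustrative assumption only and is not the scheme from the paper, whose details are not given in this summary; the function name `poison_prediction` and the permutation strategy are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poison_prediction(probs, rng):
    """Hypothetical poisoning step (an assumption, not the paper's method).

    Keeps the top-1 class where it is and randomly permutes the
    probabilities of all other classes. Assumes the top-1 probability
    is strictly larger than every other entry (no ties).
    """
    top = max(range(len(probs)), key=probs.__getitem__)
    others = [i for i in range(len(probs)) if i != top]
    shuffled = others[:]
    rng.shuffle(shuffled)
    poisoned = list(probs)
    for src, dst in zip(others, shuffled):
        poisoned[dst] = probs[src]
    return poisoned


# Usage: a 4-class prediction with one clearly dominant class.
probs = softmax([4.0, 1.0, 0.5, -2.0])
poisoned = poison_prediction(probs, random.Random(0))

clean_label = max(range(len(probs)), key=probs.__getitem__)
poisoned_label = max(range(len(poisoned)), key=poisoned.__getitem__)
```

In this toy setting a legitimate user who only reads the predicted label sees no change, while an attacker who trains a student model on the full probability vectors receives misleading information about which classes are similar to one another.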

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Integrated Circuits and Semiconductor Failure Analysis