PoTrojan: powerful neural-level trojan designs in deep learning models
Minhui Zou, Yang Shi, Chengliang Wang, Fangyu Li, WenZhan Song, Yu Wang

TL;DR
This paper introduces PoTrojan, a novel neural-level Trojan design that can be inserted into pre-trained deep learning models without retraining, potentially causing malicious mispredictions under rare trigger conditions.
Contribution
It presents what the authors describe as the first method for designing and inserting powerful, normally dormant neural-level Trojans into pre-trained models without modifying the models' existing architecture or parameters.
Findings
PoTrojans stay inactive during normal operation and are triggered only under very rare conditions; once activated, they cause the host model to mispredict or misclassify.
The insertion method is efficient and does not require re-training.
PoTrojans pose a significant security threat to AI systems.
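The behavior summarized in these findings can be illustrated with a toy sketch. This is not the authors' construction: the tiny random "pre-trained" network, the exact-match trigger rule, and every name here (host_forward, trigger_neuron, trojaned_forward, TRIGGER, TARGET_CLASS, PAYLOAD_STRENGTH) are assumptions made purely for illustration. It shows only the general idea of adding a dormant unit next to a frozen model so that outputs are unchanged except on one rare input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained model: fixed random weights that are never
# modified or re-trained anywhere below (an assumption for this toy).
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)

def host_forward(x):
    """Original two-layer network: ReLU hidden layer, then linear logits."""
    h = np.maximum(0.0, W1 @ x + b1)
    return W2 @ h + b2

# Hypothetical rare trigger input and attacker-chosen output class.
TRIGGER = np.array([0.25, -1.5, 3.0, 0.125])
TARGET_CLASS = 2
TOLERANCE = 1e-6
PAYLOAD_STRENGTH = 1e6  # large enough to dominate the toy logits

def trigger_neuron(x):
    """Added unit: outputs 1.0 only if x matches TRIGGER, otherwise 0.0."""
    return 1.0 if np.max(np.abs(x - TRIGGER)) < TOLERANCE else 0.0

def trojaned_forward(x):
    """Host logits plus a payload that is zero unless the trigger fires."""
    logits = host_forward(x)
    payload = np.zeros_like(logits)
    payload[TARGET_CLASS] = PAYLOAD_STRENGTH * trigger_neuron(x)
    return logits + payload

if __name__ == "__main__":
    benign = rng.normal(size=4)
    print("benign logits identical:",
          np.array_equal(host_forward(benign), trojaned_forward(benign)))
    print("class on trigger input:", int(np.argmax(trojaned_forward(TRIGGER))))
```

In this sketch the host weights W1, b1, W2, b2 are left untouched, so any input that does not match TRIGGER yields exactly the original logits, while the trigger input is pushed to TARGET_CLASS. How the paper actually designs its triggers and payloads is not specified in this summary.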
Abstract
With the popularity of deep learning (DL), artificial intelligence (AI) has been applied in many areas of human life. Neural network or artificial neural network (NN), the main technique behind DL, has been extensively studied to facilitate computer vision and natural language recognition. However, the more we rely on information technology, the more vulnerable we are. That is, malicious NNs could bring huge threat in the so-called coming AI era. In this paper, for the first time in the literature, we propose a novel approach to design and insert powerful neural-level trojans or PoTrojan in pre-trained NN models. Most of the time, PoTrojans remain inactive, not affecting the normal functions of their host NN models. PoTrojans could only be triggered in very rare conditions. Once activated, however, the PoTrojans could cause the host NN models to malfunction, either falsely predicting or…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Physical Unclonable Functions (PUFs) and Hardware Security
