Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases

Ren Wang; Gaoyuan Zhang; Sijia Liu; Pin-Yu Chen; Jinjun Xiong; Meng Wang

arXiv:2007.15802 · cs.LG · August 3, 2020 · 20 cites

TL;DR

This paper introduces methods for detecting Trojan neural networks when little or no training data is available. The detectors draw on connections between Trojan attacks and adversarial attacks, and on the model's internal neuron responses.

Contribution

It proposes a data-limited TrojanNet detector using adversarial attack insights and a data-free detector based on internal neuron responses, advancing Trojan detection in data-scarce settings.
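The summary above does not spell out the detection procedure, so the following is only a toy illustration of what probing internal neuron responses without data could look like. It is not the paper's algorithm. Everything in it is an assumption made for illustration: the two-layer NumPy network, the helper names (`maximize_neuron`, `probe_score`, `build_toy_models`), the step size and perturbation budget, and the planted "Trojan" weight. Starting from random noise inputs, each hidden neuron is pushed upward by gradient ascent on the input, and the resulting shift in the output logits is recorded; a model whose logits jump sharply toward a single class is flagged as suspicious.

```python
import numpy as np

def logits(x, W1, b1, W2, b2):
    """Forward pass of a toy two-layer ReLU network (hypothetical stand-in model)."""
    hidden = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ hidden + b2

def maximize_neuron(x0, w_row, steps=100, lr=0.1, eps=1.0):
    """Gradient ascent on the input to raise one hidden pre-activation.

    For a linear first layer the gradient of w_row @ x + b with respect
    to x is simply w_row. The perturbation is kept inside an L-infinity
    ball of radius eps around the starting point x0.
    """
    x = x0.copy()
    for _ in range(steps):
        x = np.clip(x + lr * w_row, x0 - eps, x0 + eps)
    return x

def probe_score(W1, b1, W2, b2, seeds):
    """Largest mean logit shift over all probed neurons, and its class."""
    best_shift, best_class = -np.inf, -1
    for j in range(W1.shape[0]):
        shifts = [
            logits(maximize_neuron(x0, W1[j]), W1, b1, W2, b2)
            - logits(x0, W1, b1, W2, b2)
            for x0 in seeds
        ]
        mean_shift = np.mean(shifts, axis=0)
        c = int(np.argmax(mean_shift))
        if mean_shift[c] > best_shift:
            best_shift, best_class = float(mean_shift[c]), c
    return best_shift, best_class

def build_toy_models(seed=0, d=16, h=8, n_classes=3, trojan_neuron=2, target=1):
    """Create a clean toy model and a copy with one planted 'Trojan' weight.

    The planted weight is an assumption for illustration: it ties one
    hidden neuron strongly to a single target class.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(h, d))
    b1 = np.zeros(h)
    W2_clean = rng.normal(scale=0.1, size=(n_classes, h))
    b2 = np.zeros(n_classes)
    W2_trojan = W2_clean.copy()
    W2_trojan[target, trojan_neuron] = 10.0
    seeds = rng.uniform(0.0, 1.0, size=(5, d))  # random noise inputs, no real data
    return {"W1": W1, "b1": b1, "b2": b2, "seeds": seeds,
            "W2_clean": W2_clean, "W2_trojan": W2_trojan}

if __name__ == "__main__":
    m = build_toy_models()
    for name in ("W2_clean", "W2_trojan"):
        score, cls = probe_score(m["W1"], m["b1"], m[name], m["b2"], m["seeds"])
        print(f"{name}: max mean logit shift = {score:.2f} (class {cls})")
```

In this toy setup the tampered model should produce a much larger shift toward the planted class than the clean one. How such a statistic would be thresholded on a real network is not covered by the summary and is left open here.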

Findings

01

Data-limited TND effectively detects TrojanNets with few samples.

02

Data-free TND identifies Trojan behavior using internal neuron responses.

03

Methods are validated on CIFAR-10, GTSRB, and ImageNet datasets.

Abstract

When the training data are maliciously tampered with, the predictions of the acquired deep neural network (DNN) can be manipulated by an adversary through what is known as a Trojan attack (or poisoning backdoor attack). The lack of robustness of DNNs against Trojan attacks could significantly harm real-life machine learning (ML) systems in downstream applications, raising widespread concern about their trustworthiness. In this paper, we study the problem of Trojan network (TrojanNet) detection in the data-scarce regime, where only the weights of a trained DNN are accessible to the detector. We first propose a data-limited TrojanNet detector (TND) for the case when only a few data samples are available for TrojanNet detection. We show that an effective data-limited TND can be established by exploring connections between Trojan attacks and prediction-evasion adversarial attacks including per-sample attack as…

Code & Models

Repositories

wangren09/TrojanNetDetector
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications