Backdoor Vulnerabilities in Normally Trained Deep Learning Models
Guanhong Tao, Zhenting Wang, Siyuan Cheng, Shiqing Ma, Shengwei An, Yingqi Liu, Guangyu Shen, Zhuo Zhang, Yunshu Mao, Xiangyu Zhang

TL;DR
This paper systematically studies natural backdoor vulnerabilities in normally trained deep learning models, revealing their widespread existence, categorizing them, and proposing a detection framework that outperforms existing methods.
Contribution
It introduces the concept of natural backdoors in normally trained models, categorizes them, and develops a detection framework that identifies significantly more backdoors than existing scanners.
Findings
Natural backdoors are prevalent in publicly available models.
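The proposed detection framework finds 315 natural backdoors in 56 normally trained models downloaded from the Internet, while existing scanners designed for injected backdoors detect at most 65.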
Most injected backdoor attacks have natural correspondences in trained models.
Abstract
We conduct a systematic study of backdoor vulnerabilities in normally trained Deep Learning models. These vulnerabilities are as dangerous as backdoors injected by data poisoning because both can be exploited in the same way. Using 20 different types of injected backdoor attacks from the literature as guidance, we study their counterparts in normally trained models, which we call natural backdoor vulnerabilities. We find that natural backdoors are widespread, with most injected backdoor attacks having natural counterparts. We categorize these natural backdoors and propose a general detection framework. It finds 315 natural backdoors in the 56 normally trained models downloaded from the Internet, covering all the different categories, while existing scanners designed for injected backdoors can detect at most 65 backdoors. We also study the root causes of natural backdoors and defenses against them.
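To make the idea of an exploitable backdoor concrete, here is a minimal sketch of how one might check whether a fixed patch trigger flips a classifier's predictions to a target label. It is not the paper's detection framework. The names `apply_patch`, `attack_success_rate`, and `toy_model`, the patch-style trigger, and the toy classifier that over-relies on a single pixel are all illustrative assumptions.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def apply_patch(image: Image, patch: Image, top: int = 0, left: int = 0) -> Image:
    """Return a copy of `image` with `patch` pasted at (top, left)."""
    stamped = [row[:] for row in image]  # copy rows so the input is untouched
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            stamped[top + i][left + j] = value
    return stamped


def attack_success_rate(
    model: Callable[[Image], int],
    images: Sequence[Image],
    labels: Sequence[int],
    patch: Image,
    target: int,
) -> float:
    """Fraction of non-target inputs classified as `target` once stamped."""
    victims = [img for img, y in zip(images, labels) if y != target]
    if not victims:
        return 0.0
    hits = sum(model(apply_patch(img, patch)) == target for img in victims)
    return hits / len(victims)


def toy_model(image: Image) -> int:
    """Hypothetical classifier: mostly uses mean brightness, but a very
    bright top-left pixel alone forces class 1 (the 'natural' shortcut)."""
    if image[0][0] > 0.9:
        return 1
    mean = sum(map(sum, image)) / (len(image) * len(image[0]))
    return 1 if mean > 0.5 else 0


if __name__ == "__main__":
    dark = [[0.1] * 4 for _ in range(4)]    # clean class-0 example
    bright = [[0.8] * 4 for _ in range(4)]  # clean class-1 example
    trigger = [[1.0]]                       # one-pixel patch (assumed trigger)
    rate = attack_success_rate(toy_model, [dark, bright], [0, 1], trigger, target=1)
    print(rate)  # 1.0: the single dark image is misclassified once stamped
```

The toy model classifies both clean images correctly, yet a one-pixel trigger redirects every class-0 input to class 1. In this sketch the shortcut arises without any poisoned training data, which is the sense in which a backdoor can be called natural.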
Taxonomy
Topics: Adversarial Robustness in Machine Learning
