BeniFul: Backdoor Defense via Middle Feature Analysis for Deep Neural Networks
Xinfu Li, Junying Zhang, Xindi Ma

TL;DR
BeniFul is a novel backdoor defense method for deep neural networks that analyzes middle-layer features to detect and eliminate backdoors, demonstrating strong performance against multiple attacks on CIFAR-10 and Tiny ImageNet.
Contribution
The paper introduces BeniFul, a comprehensive backdoor defense approach using middle feature analysis, combining detection and elimination with novel metrics.
Findings
Effective detection of backdoor inputs using Variational Auto-Encoder reconstruction distance.
Successful backdoor elimination with feature distance loss.
Strong defense performance against five state-of-the-art attacks.
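The detection finding above relies on a reconstruction distance computed over middle-layer features. The following is a minimal sketch of that general idea only, not the authors' implementation: it swaps in a linear PCA reconstructor for the Variational Auto-Encoder so that it runs with NumPy alone, and it uses synthetic arrays in place of real middle-layer features. The function names, the 95% quantile threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_reconstructor(benign_feats, k):
    """Fit a rank-k PCA reconstructor on benign features.
    ASSUMPTION: PCA is a stand-in for the VAE named in the summary."""
    mean = benign_feats.mean(axis=0)
    _, _, vt = np.linalg.svd(benign_feats - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_distance(feats, mean, components):
    """L2 distance between each feature vector and its reconstruction."""
    centered = feats - mean
    recon = centered @ components.T @ components
    return np.linalg.norm(centered - recon, axis=1)

def flag_suspicious(feats, mean, components, threshold):
    """Flag inputs whose reconstruction distance exceeds the threshold."""
    return reconstruction_distance(feats, mean, components) > threshold

rng = np.random.default_rng(0)

# Synthetic "benign" features lying near a 4-dimensional subspace of R^32.
basis = rng.normal(size=(4, 32))
benign = rng.normal(size=(500, 4)) @ basis + 0.05 * rng.normal(size=(500, 32))

# Synthetic "suspicious" features: same structure plus a fixed offset,
# a hypothetical stand-in for the shift a trigger might cause.
offset = 3.0 * rng.normal(size=32)
suspicious = (rng.normal(size=(50, 4)) @ basis + offset
              + 0.05 * rng.normal(size=(50, 32)))

mean, comps = fit_linear_reconstructor(benign, k=4)

# ASSUMPTION: threshold set at the 95th percentile of benign distances.
threshold = np.quantile(reconstruction_distance(benign, mean, comps), 0.95)

benign_flags = flag_suspicious(benign, mean, comps, threshold)
suspicious_flags = flag_suspicious(suspicious, mean, comps, threshold)
print("benign flagged:", benign_flags.mean())
print("suspicious flagged:", suspicious_flags.mean())
```

In a real pipeline the feature arrays would come from a chosen middle layer of the model under inspection, and the reconstructor would be a trained VAE; the thresholding step would stay structurally the same.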
Abstract
Backdoor defenses have recently become important in resisting backdoor attacks in deep neural networks (DNNs), where attackers implant backdoors into the DNN model by injecting backdoor samples into the training dataset. Although there are many defense methods to achieve backdoor detection for DNN inputs and backdoor elimination for DNN models, they still have not presented a clear explanation of the relationship between these two missions. In this paper, we use the features from the middle layer of the DNN model to analyze the difference between backdoor and benign samples and propose Backdoor Consistency, which indicates that at least one backdoor exists in the DNN model if the backdoor trigger is detected exactly on input. By analyzing the middle features, we design an effective and comprehensive backdoor defense method named BeniFul, which consists of two parts: a gray-box backdoor…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning
