Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models
Yige Li, Hanxun Huang, Jiaming Zhang, Xingjun Ma, and Yu-Gang Jiang

TL;DR
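This paper introduces Expose Before You Defend (EBYD), a two-step defense framework that first exposes the backdoor functionality in a backdoored neural network using a technique called Clean Unlearning (CUL), then applies detection and removal methods to the exposed model. The authors report more effective backdoor detection and removal across various datasets and attack types.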
Contribution
The paper proposes a unified defense system with a novel backdoor exposure technique called Clean Unlearning, improving backdoor detection and removal in neural networks.
Findings
EBYD significantly improves backdoor detection accuracy.
Exposed models enhance the effectiveness of downstream defense tasks.
The framework is validated on diverse datasets and attack types.
Abstract
Backdoor attacks covertly implant triggers into deep neural networks (DNNs) by poisoning a small portion of the training data with pre-designed backdoor triggers. This vulnerability is exacerbated in the era of large models, where extensive (pre-)training on web-crawled datasets is susceptible to compromise. In this paper, we introduce a novel two-step defense framework named Expose Before You Defend (EBYD). EBYD unifies existing backdoor defense methods into a comprehensive defense system with enhanced performance. Specifically, EBYD first exposes the backdoor functionality in the backdoored model through a model preprocessing step called backdoor exposure, and then applies detection and removal methods to the exposed model to identify and eliminate the backdoor features. In the first step of backdoor exposure, we propose a novel technique called Clean Unlearning (CUL), which…
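The two-step structure described in the abstract can be sketched as a small pipeline: expose the model once, then run each downstream defense on the exposed model. This is only an illustrative sketch, not the authors' implementation. The function `expose_then_defend` and the `toy_*` helpers are hypothetical names, and the toy helpers are placeholders that do not implement Clean Unlearning or any real detection or removal method.

```python
from typing import Any, Callable, Dict

Model = Any  # placeholder for whatever model object a framework uses


def expose_then_defend(
    model: Model,
    expose: Callable[[Model], Model],
    defenses: Dict[str, Callable[[Model], Any]],
) -> Dict[str, Any]:
    """Step 1: expose the model once. Step 2: run every defense on the exposed model."""
    exposed = expose(model)
    return {name: defend(exposed) for name, defend in defenses.items()}


# Toy stand-ins so the sketch runs end to end (hypothetical, not real methods).
def toy_expose(model: dict) -> dict:
    # Marks the model as exposed; a real exposure step would modify its weights.
    return {**model, "exposed": True}


def toy_detect(model: dict) -> bool:
    # Flags the model only if it has been exposed and carries a trigger.
    return model.get("exposed", False) and model.get("has_trigger", False)


def toy_remove(model: dict) -> dict:
    # Returns a copy of the model with the trigger flag cleared.
    return {**model, "has_trigger": False}


results = expose_then_defend(
    {"has_trigger": True},
    toy_expose,
    {"detection": toy_detect, "removal": toy_remove},
)
```

Passing the defenses as a dictionary of callables reflects the abstract's claim that existing detection and removal methods are plugged in after a shared exposure step; any real method would replace the toy functions.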
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Network Security and Intrusion Detection
