Backdoor Defense via Suppressing Model Shortcuts

Sheng Yang; Yiming Li; Yong Jiang; Shu-Tao Xia

arXiv:2211.05631 · cs.CV · March 7, 2023

TL;DR

This paper proposes a backdoor defense that suppresses the skip connections in selected critical layers of a neural network, reducing the attack success rate (ASR) while largely preserving benign accuracy.

Contribution

The paper examines backdoors from the angle of model structure and introduces a removal method that suppresses the skip connections in critical layers, which the authors link to the learning of model 'shortcuts'.

Findings

01. Significant decrease in attack success rate (ASR) when the outputs of key skip connections are suppressed.

02. Effective backdoor removal with minimal impact on benign accuracy.

03. Validated on benchmark datasets with extensive experiments.
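
The two quantities named in the findings, attack success rate and benign accuracy, can be sketched as follows. This is a minimal, framework-free illustration and not the authors' evaluation code: the function names are hypothetical, and excluding samples whose true label already equals the target label is a common convention assumed here, since the paper's exact protocol is not given on this page.

```python
from typing import Sequence


def attack_success_rate(
    preds_on_triggered: Sequence[int],
    true_labels: Sequence[int],
    target_label: int,
) -> float:
    """Fraction of triggered samples that the model assigns to the target label.

    Assumption: samples whose true label is already the target label are
    excluded, so a correct prediction is not counted as a successful attack.
    """
    kept = [p for p, y in zip(preds_on_triggered, true_labels) if y != target_label]
    if not kept:
        return 0.0
    return sum(p == target_label for p in kept) / len(kept)


def benign_accuracy(preds_on_clean: Sequence[int], true_labels: Sequence[int]) -> float:
    """Fraction of clean (trigger-free) samples classified correctly."""
    if not true_labels:
        return 0.0
    return sum(p == y for p, y in zip(preds_on_clean, true_labels)) / len(true_labels)


# Toy data: five samples, with class 0 as the attacker's target label.
labels = [1, 2, 3, 0, 2]
clean_preds = [1, 2, 3, 0, 1]      # predictions on the clean inputs
triggered_preds = [0, 0, 3, 0, 0]  # predictions on the same inputs with a trigger

asr = attack_success_rate(triggered_preds, labels, target_label=0)
ba = benign_accuracy(clean_preds, labels)
```

A defense in the sense of the findings above aims to push `asr` toward zero while keeping `ba` close to its value before the defense was applied.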

Abstract

Recent studies have demonstrated that deep neural networks (DNNs) are vulnerable to backdoor attacks during the training process. Specifically, adversaries intend to embed hidden backdoors in DNNs so that malicious model predictions can be activated through pre-defined trigger patterns. In this paper, we explore the backdoor mechanism from the angle of the model structure. We focus on the skip connection, inspired by the understanding that it helps the learning of model 'shortcuts', through which backdoor triggers are usually easier to learn. Specifically, we demonstrate that the attack success rate (ASR) decreases significantly when the outputs of some key skip connections are reduced. Based on this observation, we design a simple yet effective backdoor removal method that suppresses the skip connections in critical layers selected by our method. We also implement fine-tuning…
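
To make "reducing the outputs of a skip connection" concrete, here is a minimal sketch of a residual block whose identity path is scaled by a factor `gamma`. It is an illustration under stated assumptions, not the authors' implementation: the name `gamma`, the toy branch function, and the chosen values are hypothetical, and the sketch uses plain Python lists rather than the PyTorch code in the official repository. How critical layers are selected and which suppression factor is used are not specified in this excerpt.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(
    x: Vector,
    branch: Callable[[Vector], Vector],
    gamma: float = 1.0,
) -> Vector:
    """Return branch(x) + gamma * x.

    gamma = 1.0 is an ordinary skip connection; 0 <= gamma < 1 suppresses the
    identity path (gamma is a hypothetical name used for this sketch).
    """
    fx = branch(x)
    return [f + gamma * xi for f, xi in zip(fx, x)]


def toy_branch(x: Vector) -> Vector:
    """Stand-in for the block's learned transformation (conv, norm, activation)."""
    return [0.5 * xi for xi in x]


x = [1.0, -2.0, 4.0]
full = residual_block(x, toy_branch, gamma=1.0)         # standard residual output
suppressed = residual_block(x, toy_branch, gamma=0.25)  # skip path scaled down
```

With `gamma=1.0` the input passes through the shortcut unchanged, while a smaller `gamma` forces more of the signal through the learned branch. In a defense of the kind the abstract describes, such a scaling would be applied only to the skip connections of the selected critical layers.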

Code & Models

Repositories

20000yshust/Backdoor-Defense-Via-Suppressing-Model-Shortcuts (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications