UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening

Siyuan Cheng; Guangyu Shen; Kaiyuan Zhang; Guanhong Tao; Shengwei An; Hanxi Guo; Shiqing Ma; Xiangyu Zhang

arXiv:2407.11372·cs.CR·July 17, 2024

TL;DR

This paper presents UNIT, a post-training defense that mitigates backdoor attacks in neural networks by tightening each neuron's activation distribution and removing anomalously large activation values. It outperforms 7 popular defense methods against 14 backdoor attacks.

Contribution

UNIT introduces a novel approach that eliminates backdoor effects by approximating and constraining neuron activation distributions, and it remains effective against recent advanced attacks.

Findings

01. Outperforms 7 popular defense methods against 14 backdoor attacks.

02. Effective with only 5% of clean training data.

03. Cost-efficient and applicable post-training.

Abstract

Deep neural networks (DNNs) have demonstrated effectiveness in various fields. However, DNNs are vulnerable to backdoor attacks, which inject a unique pattern, called a trigger, into the input to cause misclassification to an attacker-chosen target label. While existing works have proposed various methods to mitigate backdoor effects in poisoned models, they tend to be less effective against recent advanced attacks. In this paper, we introduce UNIT, a novel post-training defense technique that can effectively eliminate backdoor effects for a variety of attacks. Specifically, UNIT approximates a unique and tight activation distribution for each neuron in the model. It then proactively dispels substantially large activation values that exceed the approximated boundaries. Our experimental results demonstrate that UNIT outperforms 7 popular defense methods against 14 existing backdoor attacks,…
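
The core idea in the abstract, bounding each neuron's activation and clipping values that exceed the bound, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how UNIT approximates its boundaries, so the sketch assumes a simple per-neuron quantile of clean activations as a stand-in. The names estimate_bounds and tighten, the quantile parameter, and the toy data are all hypothetical.

```python
from typing import List


def estimate_bounds(clean_acts: List[List[float]], quantile: float = 0.95) -> List[float]:
    """Assumed stand-in for boundary approximation: one upper bound per neuron.

    clean_acts[i][j] is the activation of neuron j on clean sample i.
    The bound is the nearest-rank quantile of that neuron's clean activations.
    """
    n_neurons = len(clean_acts[0])
    bounds = []
    for j in range(n_neurons):
        column = sorted(row[j] for row in clean_acts)
        # Nearest-rank index, capped at the last element.
        idx = min(len(column) - 1, int(quantile * len(column)))
        bounds.append(column[idx])
    return bounds


def tighten(acts: List[float], bounds: List[float]) -> List[float]:
    """Clip each neuron's activation to its own upper bound."""
    return [min(a, b) for a, b in zip(acts, bounds)]


# Toy data: 4 clean samples, 2 neurons.
clean = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.6], [0.2, 0.5]]
bounds = estimate_bounds(clean, quantile=0.75)

# A hypothetical triggered input drives neuron 1 far outside its clean range.
suspicious = [0.25, 9.0]
clipped = tighten(suspicious, bounds)
print(bounds, clipped)
```

In this toy setup the in-range activation of neuron 0 passes through unchanged, while the abnormally large activation of neuron 1 is cut back to its bound. In a real network the same clipping would be applied per layer during the forward pass, and the bounds would presumably be tuned to preserve clean accuracy.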

Code & Models

Repositories

megum1/unit (PyTorch, official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications