NBA: defensive distillation for backdoor removal via neural behavior alignment

Zonghao Ying; Bin Wu

arXiv:2406.10846 · cs.CR · June 18, 2024

TL;DR

This paper introduces Neural Behavioral Alignment (NBA), a novel defense method that effectively removes backdoors from neural networks by aligning their neural behaviors during knowledge distillation.

Contribution

NBA is a new defense mechanism that enhances backdoor removal by optimizing knowledge transfer and behavior alignment between models.

Findings

01

NBA defends against six different backdoor attacks.

02

NBA outperforms five state-of-the-art defenses.

03

NBA effectively removes backdoors while maintaining model accuracy.

Abstract

Deep neural networks have recently been shown to be vulnerable to backdoor attacks, in which an attacker implants a backdoor in a network and thereby compromises its integrity. When the attacker presents a trigger at test time, the backdoor is activated and the network makes specific wrong predictions. Because these attacks are stealthy and dangerous, defending against them is important. In this paper, we propose a novel defense mechanism, Neural Behavioral Alignment (NBA), for backdoor removal. NBA tailors the distillation process to backdoor defense by optimizing two aspects, the form of the knowledge and the distillation samples, to improve defense performance. To facilitate knowledge transfer, NBA builds high-level representations of neural behavior within networks. Additionally, NBA…
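
For readers unfamiliar with distillation, the sketch below shows the standard temperature-scaled logit-matching loss in which a student network is trained to match a teacher's softened outputs. It is a generic illustration only and is not NBA's behavior-alignment objective, which the truncated abstract does not specify. The function names `softmax` and `distillation_loss`, and the default temperature of 2.0, are assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stabilized)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared.

    Hypothetical helper for illustration; generic logit distillation,
    not the loss used by NBA.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# The loss is zero when the student reproduces the teacher exactly
# and positive when their output distributions differ.
same = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
different = distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

Minimizing such a loss over a chosen set of samples pulls the student's outputs toward the teacher's. According to the abstract, NBA instead transfers knowledge through high-level representations of neural behavior and also optimizes which samples are used for distillation.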
