Efficient Backdoor Removal Through Natural Gradient Fine-tuning

Nazmul Karim; Abdullah Al Arafat; Umar Khalid; Zhishan Guo; Nazanin Rahnavard

arXiv:2306.17441 · cs.CV · July 3, 2023 · 2 cites

TL;DR

This paper introduces Natural Gradient Fine-tuning (NGF), a novel method for backdoor removal in neural networks that fine-tunes only one layer using a geometry-aware optimizer and a regularizer, achieving state-of-the-art results.

Contribution

The paper proposes NGF, a backdoor purification technique that fine-tunes a single layer with a geometry-aware optimizer and a regularizer based on Fisher Information, reducing computational costs and improving performance.
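The idea of updating a single layer with a Fisher-preconditioned (natural) gradient can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the layer being fine-tuned is a final logistic classifier on top of frozen features, approximates the Fisher Information by its diagonal, and omits the paper's regularizer. The names `natural_gradient_step`, `mean_log_loss`, `clean_feats`, and `clean_labels`, as well as the `lr` and `damping` values, are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_log_loss(w, feats, labels):
    """Average binary cross-entropy of a logistic layer with weights w."""
    total = 0.0
    for x, y in zip(feats, labels):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(feats)


def natural_gradient_step(w, feats, labels, lr=0.5, damping=1e-3):
    """One update of a single layer, preconditioned by a diagonal Fisher estimate.

    Assumption: `feats` are outputs of earlier layers that stay frozen,
    so only the weights `w` of this one layer change.
    """
    n = len(feats)
    grad = [0.0] * len(w)
    fisher = [0.0] * len(w)
    for x, y in zip(feats, labels):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj / n                # gradient of the log loss
            fisher[j] += p * (1.0 - p) * xj * xj / n   # diagonal Fisher entry
    # Natural-gradient update: rescale each gradient entry by the inverse
    # (damped) Fisher entry instead of using the raw gradient.
    return [wi - lr * g / (f + damping) for wi, g, f in zip(w, grad, fisher)]


# Toy stand-in for a small clean validation set (illustrative data only).
clean_feats = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
clean_labels = [1, 1, 0, 0]

w = [0.0, 0.0]
loss_before = mean_log_loss(w, clean_feats, clean_labels)
for _ in range(5):
    w = natural_gradient_step(w, clean_feats, clean_labels)
loss_after = mean_log_loss(w, clean_feats, clean_labels)
```

The damping term keeps the division well behaved when a Fisher entry is close to zero. A full implementation would need a Fisher estimate suited to the chosen layer of a deep network, which this toy example does not attempt.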

Findings

1. NGF effectively removes backdoors across multiple datasets and attacks.
2. Achieves state-of-the-art backdoor defense performance.
3. Reduces computational costs by fine-tuning only one layer.

Abstract

The success of a deep neural network (DNN) heavily relies on the details of the training scheme, e.g., training data, architectures, and hyper-parameters. Recent backdoor attacks suggest that an adversary can take advantage of such training details and compromise the integrity of a DNN. Our studies show that a backdoor model is usually optimized to a bad local minimum, i.e., a sharper minimum than that of a benign model. Intuitively, a backdoor model can be purified by re-optimizing it to a smoother minimum through fine-tuning with a small amount of clean validation data. However, fine-tuning all DNN parameters often incurs a high computational cost and often results in sub-par clean test performance. To address this concern, we propose a novel backdoor purification technique, Natural Gradient Fine-tuning (NGF), which focuses on removing the backdoor by fine-tuning only one layer. Specifically,…

Code & Models

Repositories

nazmul-karim170/natural-gradient-finetuning-trojan-defense
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications