Anti-Backdoor Learning: Training Clean Models on Poisoned Data

Yige Li; Xixiang Lyu; Nodens Koren; Lingjuan Lyu; Bo Li; Xingjun Ma

arXiv:2110.11571 · cs.LG · December 2, 2021 · 35 cites

TL;DR

This paper introduces Anti-Backdoor Learning (ABL), a training scheme that prevents backdoor triggers from being embedded into neural networks trained on poisoned data by exploiting two inherent weaknesses of backdoor attacks.

Contribution

The paper proposes a two-stage, gradient-ascent-based training method: it first isolates suspected backdoor examples early in training and then unlearns them, so that the model resists backdoor injection during learning itself.
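
As a rough illustration of the two-stage idea, the sketch below works on plain lists of per-example loss values. It is an assumption-laden reading of the method, not the authors' implementation (the official code is in the bboylyg/abl repository): the function names, the threshold `gamma`, and the `isolation_ratio` parameter are all hypothetical. Stage one flips the sign of the loss for examples whose loss falls below `gamma` and then flags the lowest-loss examples as suspected backdoor data; stage two minimizes the loss on the remaining data while maximizing it on the flagged subset.

```python
def sign_flipped_loss(losses, gamma):
    """Stage 1 objective (sketch): mean of sign(l - gamma) * l.

    Examples with loss below gamma contribute a negated loss, so minimizing
    this objective performs gradient ascent on them and keeps them near gamma.
    """
    total = 0.0
    for loss in losses:
        if loss > gamma:
            total += loss
        elif loss < gamma:
            total -= loss
        # loss == gamma contributes nothing (sign is zero)
    return total / len(losses)


def isolate_lowest_loss(losses, isolation_ratio):
    """Flag the fraction of examples with the smallest loss as suspected backdoor."""
    k = max(1, int(len(losses) * isolation_ratio))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(ranked[:k])


def unlearning_loss(kept_losses, isolated_losses):
    """Stage 2 objective (sketch): descend on kept data, ascend on isolated data.

    Assumes both lists are non-empty.
    """
    kept_mean = sum(kept_losses) / len(kept_losses)
    isolated_mean = sum(isolated_losses) / len(isolated_losses)
    return kept_mean - isolated_mean


if __name__ == "__main__":
    # Made-up per-example losses; the two very small values stand in for
    # examples the model has fit unusually fast.
    losses = [0.05, 1.2, 0.9, 0.02, 1.5, 1.1, 0.8, 1.3, 1.0, 0.7]
    flagged = isolate_lowest_loss(losses, isolation_ratio=0.2)
    kept = [l for i, l in enumerate(losses) if i not in flagged]
    isolated = [losses[i] for i in flagged]
    print("flagged indices:", flagged)
    print("stage-2 objective:", unlearning_loss(kept, isolated))
```

In a real training loop the loss values would come from an unreduced per-example loss, and the two objectives would be backpropagated through the model; here they are computed on fixed numbers only to show the arithmetic.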

Findings

1. Models trained with ABL on poisoned data perform comparably to models trained on purely clean data.

2. The method effectively isolates backdoor examples early in training.

3. ABL outperforms existing defenses against multiple backdoor attack methods.

Abstract

Backdoor attack has emerged as a major security threat to deep neural networks (DNNs). While existing defense methods have demonstrated promising results on detecting or erasing backdoors, it is still not clear whether robust training methods can be devised to prevent the backdoor triggers being injected into the trained model in the first place. In this paper, we introduce the concept of anti-backdoor learning, aiming to train clean models given backdoor-poisoned data. We frame the overall learning process as a dual-task of learning the clean and the backdoor portions of data. From this view, we identify two inherent characteristics of backdoor attacks as their weaknesses: 1) the models learn backdoored data much faster than learning with clean data, and the stronger the attack the faster the model converges on backdoored data; 2) the backdoor task is tied…

Code & Models

Repositories

bboylyg/abl (PyTorch, official)

Videos

Anti-Backdoor Learning: Training Clean Models on Poisoned Data · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning