FLARE: Toward Universal Dataset Purification against Backdoor Attacks
Linshan Hou, Wei Luo, Zhongyun Hua, Songhua Chen, Leo Yu Zhang, Yiming Li

TL;DR
FLARE is a universal dataset purification method that detects and removes backdoor-poisoned training samples by aggregating activations from multiple hidden layers and adaptively clustering the data, outperforming existing purification techniques.
Contribution
This paper introduces FLARE, a novel purification approach that aggregates multi-layer activations and adaptively selects subspaces for robust backdoor detection across various attack types.
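The general idea of aggregating multi-layer activations and then clustering can be illustrated with a minimal sketch. This is not FLARE's implementation: plain 2-means stands in for the paper's adaptive subspace selection and clustering, the function names (`aggregate_activations`, `two_means`, `flag_suspicious`) are hypothetical, and the sketch assumes that per-layer activations have already been extracted as arrays and that the smaller cluster is the suspicious one.

```python
import numpy as np


def aggregate_activations(layer_acts):
    """Standardize each layer's activations per feature, then concatenate.

    layer_acts: list of arrays, each of shape (n_samples, n_features_of_layer).
    """
    normed = []
    for acts in layer_acts:
        acts = np.asarray(acts, dtype=float)
        mean = acts.mean(axis=0, keepdims=True)
        std = acts.std(axis=0, keepdims=True) + 1e-8  # avoid division by zero
        normed.append((acts - mean) / std)
    return np.concatenate(normed, axis=1)


def two_means(features, n_iter=50):
    """Plain 2-means clustering (an assumed stand-in, not the paper's method)."""
    # Deterministic initialization: the sample farthest from the global mean,
    # and the sample farthest from that first sample.
    center = features.mean(axis=0)
    first = np.argmax(np.linalg.norm(features - center, axis=1))
    second = np.argmax(np.linalg.norm(features - features[first], axis=1))
    centroids = np.stack([features[first], features[second]])

    for _ in range(n_iter):
        dists = np.linalg.norm(
            features[:, None, :] - centroids[None, :, :], axis=2
        )
        labels = dists.argmin(axis=1)
        new_centroids = np.stack([
            features[labels == k].mean(axis=0) if np.any(labels == k)
            else centroids[k]
            for k in range(2)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels


def flag_suspicious(layer_acts):
    """Return a boolean mask marking samples in the smaller cluster.

    Treating the minority cluster as poisoned is an assumption of this sketch.
    """
    features = aggregate_activations(layer_acts)
    labels = two_means(features)
    minority = np.argmin(np.bincount(labels, minlength=2))
    return labels == minority
```

In use, `flag_suspicious` would receive one activation array per hidden layer for the same set of training samples, and the samples it flags would be candidates for removal before retraining.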
Findings
FLARE outperforms existing methods on multiple benchmark datasets.
It is effective against 22 different backdoor attack methods.
It remains robust to adaptive attack strategies.
Abstract
Deep neural networks (DNNs) are susceptible to backdoor attacks, where adversaries poison datasets with adversary-specified triggers to implant hidden backdoors, enabling malicious manipulation of model predictions. Dataset purification serves as a proactive defense by removing malicious training samples to prevent backdoor injection at its source. We first reveal that the current advanced purification methods rely on a latent assumption that the backdoor connections between triggers and target labels in backdoor attacks are simpler to learn than the benign features. We demonstrate that this assumption, however, does not always hold, especially in all-to-all (A2A) and untargeted (UT) attacks. As a result, purification methods that analyze the separation between the poisoned and benign samples in the input-output space or the final hidden layer space are less effective. We observe that…
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications
