AntidoteRT: Run-time Detection and Correction of Poison Attacks on Neural Networks
Muhammad Usman, Youcheng Sun, Divya Gopinath, Corina S. Pasareanu

TL;DR
AntidoteRT introduces a lightweight, run-time method for detecting and correcting backdoor poisoning attacks in neural networks, using neuron pattern analysis and input correction to improve security without retraining.
Contribution
The paper presents a novel run-time detection and correction framework that operates on individual inputs, in contrast to traditional offline defenses that work on the model itself.
Findings
Outperforms NeuralCleanse and STRIP on benchmark datasets.
Effective against BadNets and DFST poisoning attacks.
Works at run time without retraining.
Abstract
We study backdoor poisoning attacks against image classification networks, whereby an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger causes the classifier to predict some target class. There are several techniques proposed in the literature that aim to detect the attack but only a few also propose to defend against it, and they typically involve retraining the network, which is not always possible in practice. We propose lightweight automated detection and correction techniques against poisoning attacks, which are based on neuron patterns mined from the network using a small set of clean and poisoned test samples with known labels. The patterns built based on the mis-classified samples are used for run-time detection of new poisoned inputs. For correction, we propose an input correction technique that uses a differential…
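The detection idea in the abstract (patterns mined from neuron activations on a small labeled set, then matched against new inputs at run time) can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a deliberately simple stand-in, a conjunction of neuron on/off states shared by all poisoned samples and rare among clean ones. The function names `mine_pattern` and `matches`, the 0.5 threshold, and the toy activation values are all hypothetical.

```python
import numpy as np

def mine_pattern(poisoned_acts, clean_acts):
    """Mine a simple on/off neuron pattern (illustrative stand-in only).

    Inputs are 2-D arrays of hidden-layer activations, one row per sample.
    Returns (neuron indices, required on/off state for each index).
    """
    p_on = poisoned_acts > 0
    c_on = clean_acts > 0
    all_on = p_on.all(axis=0)        # neuron fires on every poisoned sample
    all_off = (~p_on).all(axis=0)    # neuron is silent on every poisoned sample
    stable = all_on | all_off
    # Fraction of clean samples that share the poisoned state of each neuron.
    clean_rate = c_on.mean(axis=0)
    clean_agree = np.where(all_on, clean_rate, 1.0 - clean_rate)
    # Keep stable neurons that most clean samples disagree with
    # (0.5 is an assumed threshold, not taken from the paper).
    keep = stable & (clean_agree < 0.5)
    idx = np.flatnonzero(keep)
    return idx, all_on[idx]

def matches(pattern, acts):
    """Flag rows of `acts` whose on/off states satisfy the whole pattern."""
    idx, states = pattern
    acts = np.atleast_2d(acts)
    if idx.size == 0:
        return np.zeros(len(acts), dtype=bool)
    return ((acts[:, idx] > 0) == states).all(axis=1)

# Toy activations for 4 hidden neurons (made-up numbers for illustration).
poisoned = np.array([[2.0, 0.0, 1.0, 0.5],
                     [1.5, 0.0, 0.0, 0.7],
                     [3.0, 0.0, 2.0, 0.1]])
clean = np.array([[0.0, 1.0, 1.0, 0.2],
                  [0.0, 2.0, 0.0, 0.9],
                  [0.0, 0.5, 0.3, 0.4],
                  [1.0, 0.4, 0.0, 0.3]])

pattern = mine_pattern(poisoned, clean)
new_inputs = np.array([[2.2, 0.0, 0.0, 0.9],   # shares the mined signature
                       [0.0, 1.1, 0.4, 0.2]])  # looks like a clean sample
flags = matches(pattern, new_inputs)
```

In this toy setup the mined pattern is "neuron 0 on, neuron 1 off", so only the first new input is flagged. A run-time monitor along these lines needs only a forward pass and a comparison per input, which is consistent with the abstract's emphasis on avoiding retraining.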
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
