Deep Probabilistic Models to Detect Data Poisoning Attacks
Mahesh Subedar, Nilesh Ahuja, Ranganath Krishnan, Ibrahima J. Ndiour, Omesh Tickoo

TL;DR
This paper explores two ways to detect backdoor data poisoning in deep neural networks: uncertainty estimates derived from the likelihoods of deep features, and uncertainty estimates obtained through Bayesian inference. Both are used to flag poisoned samples.
Contribution
It introduces two novel approaches for detecting backdoor data poisoning attacks using uncertainty quantification in deep neural networks.
Findings
Uncertainty estimates can effectively distinguish poisoned from clean samples.
Modeling deep features with probability distributions aids in poisoning detection.
Bayesian neural networks provide reliable uncertainty measures for identifying malicious data.
Abstract
Data poisoning attacks compromise the integrity of machine-learning models by introducing malicious training samples to influence the results at test time. In this work, we investigate a backdoor data poisoning attack on deep neural networks (DNNs), carried out by inserting a backdoor pattern into the training images. The resulting model misclassifies poisoned test samples while maintaining high accuracy on the clean test set. We present two approaches for detecting such poisoned samples by quantifying the uncertainty estimates associated with the trained models. In the first approach, we model the outputs of the various layers (deep features) with parametric probability distributions learnt from a clean held-out dataset. At inference, the likelihoods of the deep features with respect to these distributions are calculated to derive uncertainty estimates. In the second approach, we use Bayesian deep…
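The first approach in the abstract (fit parametric distributions to deep features from clean held-out data, then score new samples by their likelihood) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the choice of one Gaussian per class, the ridge term added to the covariance, the 95th-percentile threshold, the function names, and the synthetic 4-dimensional "features", in which a strongly shifted cluster stands in for poisoned inputs. Real deep features and real backdoor samples may be far less cleanly separated.

```python
import numpy as np


def fit_class_gaussians(features, labels, ridge=1e-3):
    """Fit one Gaussian per class to clean held-out features (illustrative choice)."""
    dim = features.shape[1]
    params = {}
    for c in np.unique(labels):
        x = features[labels == c]
        mean = x.mean(axis=0)
        # The ridge term keeps the covariance well conditioned (an assumption).
        cov = np.cov(x, rowvar=False) + ridge * np.eye(dim)
        params[int(c)] = (mean, cov)
    return params


def gaussian_log_likelihood(x, mean, cov):
    """Log-density of each row of x under a multivariate Gaussian."""
    dim = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    solved = np.linalg.solve(cov, diff.T).T
    mahalanobis = np.sum(diff * solved, axis=1)
    return -0.5 * (mahalanobis + logdet + dim * np.log(2.0 * np.pi))


def uncertainty_score(x, params):
    """Negative log-likelihood under the best-fitting class; higher means more unusual."""
    ll = np.stack(
        [gaussian_log_likelihood(x, m, c) for m, c in params.values()], axis=1
    )
    return -ll.max(axis=1)


rng = np.random.default_rng(0)

# Synthetic stand-ins for deep features of two clean classes (hypothetical data).
held_out = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(200, 4)),
    rng.normal(loc=5.0, scale=1.0, size=(200, 4)),
])
held_out_labels = np.array([0] * 200 + [1] * 200)
params = fit_class_gaussians(held_out, held_out_labels)

# Threshold taken from the clean held-out scores (the percentile is an assumption).
threshold = np.percentile(uncertainty_score(held_out, params), 95)

clean_test = rng.normal(loc=0.0, scale=1.0, size=(50, 4))
shifted_test = rng.normal(loc=12.0, scale=1.0, size=(50, 4))  # stand-in for poisoned features

clean_scores = uncertainty_score(clean_test, params)
shifted_scores = uncertainty_score(shifted_test, params)
flagged = shifted_scores > threshold
```

In this toy setting the shifted samples receive much higher scores than the clean ones, so a threshold calibrated on held-out data separates them. With a real network, the features would come from one or more intermediate layers, and the distribution family and threshold would need to be chosen and validated for that model.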
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
Methods: Test
