Anomaly Detection of Adversarial Examples using Class-conditional Generative Adversarial Networks
Hang Wang, David J. Miller, George Kesidis

TL;DR
This paper introduces an unsupervised anomaly detection method for adversarial examples in DNNs using class-conditional GANs, outperforming previous methods across various attack scenarios.
Contribution
The paper presents an unsupervised detector of adversarial examples that uses a class-conditional GAN (an AC-GAN) to model clean data conditioned on the predicted class, and computes three detection statistics from the AC-GAN Generator and Discriminator.
Findings
Outperforms previous detection methods on multiple datasets
Detection effectiveness varies with features from different DNN layers
Anomalies are harder to detect using features closer to the output layer
Abstract
Deep Neural Networks (DNNs) have been shown to be vulnerable to Test-Time Evasion attacks (TTEs, or adversarial examples), which, by making small changes to the input, alter the DNN's decision. We propose an unsupervised attack detector for DNN classifiers based on class-conditional Generative Adversarial Networks (GANs). We model the distribution of clean data conditioned on the predicted class label by an Auxiliary Classifier GAN (AC-GAN). Given a test sample and its predicted class, three detection statistics are calculated based on the AC-GAN Generator and Discriminator. Experiments on image classification datasets under various TTE attacks show that our method outperforms previous detection methods. We also investigate the effectiveness of anomaly detection using different DNN layers (input features or internal-layer features) and demonstrate, as one might expect, that anomalies are…
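The abstract does not spell out the three detection statistics, so the following is only a rough sketch of how class-conditional anomaly scores of this general kind might be computed and thresholded without attack examples. Everything named here is an assumption for illustration: the three scores (a Discriminator "real" score, an auxiliary-classifier score on the predicted class, and a reconstruction distance to a class-conditioned Generator output), the toy stand-in functions, and the quantile-based threshold are not taken from the paper.

```python
import numpy as np

EPS = 1e-12  # avoids log(0)

def detection_scores(x, predicted_class, disc_real_prob, disc_class_probs, reconstruct):
    """Return three illustrative anomaly scores (higher = more suspicious).

    All three callables are hypothetical stand-ins for a trained AC-GAN:
      disc_real_prob(x)      -> probability that x is real (scalar)
      disc_class_probs(x)    -> class posterior from the auxiliary classifier
      reconstruct(x, c)      -> class-c Generator output closest to x
    """
    # Score 1: a low "real" probability from the Discriminator is suspicious.
    s_source = -np.log(disc_real_prob(x) + EPS)
    # Score 2: a low auxiliary-classifier probability on the DNN's predicted class.
    s_class = -np.log(disc_class_probs(x)[predicted_class] + EPS)
    # Score 3: squared distance between x and its class-conditioned reconstruction.
    s_recon = float(np.sum((x - reconstruct(x, predicted_class)) ** 2))
    return np.array([s_source, s_class, s_recon])

def calibrate_threshold(clean_scores, false_positive_rate=0.05):
    """Pick a threshold from clean data only, at a target false-positive rate."""
    return float(np.quantile(clean_scores, 1.0 - false_positive_rate))

# --- Toy stand-ins (NOT a real AC-GAN): two classes with fixed prototypes. ---
MEANS = np.array([[0.0, 0.0], [4.0, 4.0]])

def toy_real_prob(x):
    # Samples near any class prototype look "real".
    return float(np.exp(-0.5 * np.min(np.sum((x - MEANS) ** 2, axis=1))))

def toy_class_probs(x):
    # Softmax over negative squared distances to the prototypes.
    logits = -0.5 * np.sum((x - MEANS) ** 2, axis=1)
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()

def toy_reconstruct(x, c):
    # An idealized Generator that always emits the class prototype.
    return MEANS[c]

x = np.array([0.1, -0.2])
# Same input, scored under a consistent prediction (class 0) and a
# mismatched prediction (class 1), as might happen after an evasion attack.
clean = detection_scores(x, 0, toy_real_prob, toy_class_probs, toy_reconstruct)
suspicious = detection_scores(x, 1, toy_real_prob, toy_class_probs, toy_reconstruct)
threshold = calibrate_threshold(np.arange(100.0), false_positive_rate=0.05)
```

In this toy setup the class-dependent scores rise when the predicted class disagrees with where the sample actually lies, while the class-independent Discriminator score is unchanged. Conditioning on the predicted label is what makes the mismatch visible in this sketch; how the paper combines its statistics is not stated in the text above.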
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques
Methods: Auxiliary Classifier
