Learning To Characterize Adversarial Subspaces
Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Yuan He, Hui Xue

TL;DR
This paper introduces a novel adversarial detection method that adaptively learns metrics to characterize adversarial subspaces using neighbor context, significantly improving detection accuracy across various attack scenarios.
Contribution
The paper proposes the Neighbor Context Encoder (NCE), a new model that learns from neighbor context to better detect adversarial examples, overcoming limitations of previous fixed-metric approaches.
Findings
Outperforms existing methods in attack-aware black-box detection
Effective in attack-unaware black-box detection
Achieves superior results in white-box detection
Abstract
Deep Neural Networks (DNNs) are known to be vulnerable to maliciously generated adversarial examples. To detect these adversarial examples, previous methods use manually designed metrics to characterize the properties of the adversarial subspaces where adversarial examples lie. However, we find that these methods do not work in practical attack detection scenarios, because the manually defined features lack robustness and have limited discriminative power against strong attacks. To solve this problem, we propose a novel adversarial detection method that identifies adversarial examples by adaptively learning reasonable metrics to characterize adversarial subspaces. As auxiliary context information, the k nearest neighbors are used to represent the subspace surrounding the detected sample. We propose an innovative model called Neighbor Context Encoder (NCE)…
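The abstract describes using the k nearest neighbors of a sample as context for detection. A minimal sketch of that retrieval step follows. It is not the NCE model itself: the function name `neighbor_context`, the choice of Euclidean distance, and the toy two-dimensional feature vectors are all assumptions made for illustration, and the source does not state which distance or feature space the authors use.

```python
import math

def neighbor_context(query, reference, k=3):
    """Return the query stacked with its k nearest reference vectors.

    Hypothetical helper: Euclidean distance is an assumed metric,
    not one confirmed by the paper summary.
    """
    # Rank reference indices by distance to the query, nearest first.
    order = sorted(range(len(reference)),
                   key=lambda i: math.dist(query, reference[i]))
    nearest = order[:k]
    # Row 0 is the sample under test; rows 1..k are its neighbors.
    context = [list(query)] + [list(reference[i]) for i in nearest]
    return context, nearest

# Toy feature vectors standing in for DNN features (illustrative only).
reference = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0], [0.5, 0.5]]
query = [0.4, 0.4]
context, nearest = neighbor_context(query, reference, k=3)
print(nearest)
print(len(context))
```

In a detector along the lines the abstract describes, a learned model would presumably consume a context like this one in place of a fixed, hand-defined statistic over the neighbors.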
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques
