Robust Out-of-distribution Detection for Neural Networks
Jiefeng Chen, Yixuan Li, Xi Wu, Yingyu Liang, Somesh Jha

TL;DR
This paper addresses the vulnerability of existing out-of-distribution detection methods to adversarial perturbations and introduces ALOE, a robust training algorithm that significantly enhances detection robustness on benchmark datasets.
Contribution
The paper introduces ALOE, an adversarial training approach that makes OOD detection methods robust to small adversarial perturbations of both in-distribution and OOD inputs.
Findings
ALOE improves AUROC by 58.4% on CIFAR-10
ALOE improves AUROC by 46.59% on CIFAR-100
Existing methods are vulnerable to small adversarial perturbations
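For readers unfamiliar with the metric behind these numbers, the following is a minimal sketch of how AUROC can be computed for an OOD detector. It assumes the detector assigns higher scores to in-distribution inputs (a convention chosen here, not stated in the summary), and the score lists are made-up toy values, not results from the paper.

```python
def auroc(in_scores, ood_scores):
    """Probability that a random in-distribution score exceeds a random
    OOD score, counting ties as one half (pairwise definition of AUROC)."""
    wins = 0.0
    for s_in in in_scores:
        for s_out in ood_scores:
            if s_in > s_out:
                wins += 1.0
            elif s_in == s_out:
                wins += 0.5
    return wins / (len(in_scores) * len(ood_scores))

# Toy scores (hypothetical): a detector that separates clean inputs perfectly...
clean = auroc([0.9, 0.8, 0.7], [0.3, 0.2, 0.6])
# ...and the same detector after perturbations push the two groups past each other.
attacked = auroc([0.4, 0.5, 0.3], [0.6, 0.45, 0.7])
print(clean, attacked)
```

An AUROC of 0.5 corresponds to chance-level ranking, so a value well below 0.5, as in the second toy case, means the detector ranks OOD inputs above in-distribution ones more often than not.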
Abstract
Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in the real world. Existing approaches for detecting OOD examples work well when evaluated on benign in-distribution and OOD samples. However, in this paper, we show that existing detection mechanisms can be extremely brittle when evaluating on in-distribution and OOD inputs with minimal adversarial perturbations which don't change their semantics. Formally, we extensively study the problem of Robust Out-of-Distribution Detection on common OOD detection approaches, and show that state-of-the-art OOD detectors can be easily fooled by adding small perturbations to the in-distribution and OOD inputs. To counteract these threats, we propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples. Our…
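The training idea described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the model is a linear softmax classifier so the input gradient can be written analytically, the perturbation set is an L-infinity ball, outliers are pushed toward a uniform predictive distribution, and the names `linf_attack`, `robust_objective`, `eps`, `step`, and `lam` are hypothetical.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def cross_entropy(W, b, x, target):
    """Per-example cross-entropy between softmax(x @ W + b) and a target distribution."""
    return -(target * log_softmax(x @ W + b)).sum(axis=1)

def linf_attack(W, b, x, target, eps, step, n_steps):
    """Projected sign-gradient ascent on the loss within an L-infinity ball of radius eps."""
    x_adv = x.copy()
    for _ in range(n_steps):
        p = np.exp(log_softmax(x_adv @ W + b))
        grad_x = (p - target) @ W.T               # analytic input gradient (linear model only)
        x_adv = x_adv + step * np.sign(grad_x)    # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
    return x_adv

def robust_objective(W, b, x_in, y_in, x_out, eps=0.1, step=0.025, n_steps=5, lam=0.5):
    """Loss on adversarial inliers plus a weighted loss on adversarial outliers."""
    k = W.shape[1]
    onehot = np.eye(k)[y_in]                        # inliers: true-label target
    uniform = np.full((len(x_out), k), 1.0 / k)     # outliers: uniform target (assumption)
    x_in_adv = linf_attack(W, b, x_in, onehot, eps, step, n_steps)
    x_out_adv = linf_attack(W, b, x_out, uniform, eps, step, n_steps)
    inlier_loss = cross_entropy(W, b, x_in_adv, onehot).mean()
    outlier_loss = cross_entropy(W, b, x_out_adv, uniform).mean()
    return inlier_loss + lam * outlier_loss, x_in_adv, x_out_adv

# Random toy data standing in for in-distribution and outlier batches.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 3)), np.zeros(3)
x_in, y_in = rng.normal(size=(8, 4)), rng.integers(0, 3, size=8)
x_out = rng.normal(size=(8, 4))
loss, x_in_adv, x_out_adv = robust_objective(W, b, x_in, y_in, x_out)
print(float(loss))
```

In an actual training loop, the returned loss would be minimized over the model parameters while the attack is recomputed for every batch; a deep network would replace the analytic gradient with automatic differentiation.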
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
