DOS: Diverse Outlier Sampling for Out-of-Distribution Detection
Wenyu Jiang, Hao Cheng, Mingcai Chen, Chongjun Wang, Hongxin Wei

TL;DR
This paper introduces DOS, a novel outlier sampling method that emphasizes diversity to improve out-of-distribution detection in neural networks, outperforming existing approaches by effectively capturing the full outlier distribution.
Contribution
The paper proposes a simple yet effective clustering-based sampling strategy called DOS that enhances OOD detection by ensuring diverse outlier selection.
Findings
DOS reduces FPR95 by up to 25.79% on CIFAR-100 with TI-300K.
Diversity in outlier sampling improves the decision boundary between ID and OOD data.
Sampling from diverse clusters leads to better OOD detection performance.
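FPR95, the metric cited in the findings above, is conventionally the fraction of OOD inputs still accepted as in-distribution when the threshold is set so that 95% of ID inputs are accepted. A minimal sketch of that computation follows, assuming a higher score means "more ID-like"; the function name `fpr_at_95_tpr` and the score convention are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID when ~95% of ID samples are accepted.

    Assumption: higher score = more in-distribution-like.
    """
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Threshold at the 5th percentile of ID scores, so ~95% of ID scores lie at or above it.
    threshold = np.percentile(id_scores, 5)
    # An OOD sample is a false positive if it also clears the threshold.
    return float(np.mean(ood_scores >= threshold))
```

Lower is better: a value of 0 means no OOD sample passes the threshold that keeps 95% of ID data.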
Abstract
Modern neural networks are known to give overconfident predictions for out-of-distribution inputs when deployed in the open world. It is common practice to leverage a surrogate outlier dataset to regularize the model during training, and recent studies emphasize the role of uncertainty in designing the sampling strategy for the outlier dataset. However, OOD samples selected solely on the basis of predictive uncertainty can be biased towards certain types, which may fail to capture the full outlier distribution. In this work, we empirically show that diversity is critical in sampling outliers for OOD detection performance. Motivated by this observation, we propose a straightforward and novel sampling strategy named DOS (Diverse Outlier Sampling) to select diverse and informative outliers. Specifically, we cluster the normalized features at each iteration, and the most informative outlier from…
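The sampling step described in the abstract (normalize features, cluster them, keep the most informative outlier per cluster) can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: the plain Lloyd-style k-means, the function names, and the `scores` array (a hypothetical per-sample informativeness value, higher meaning more informative) are all assumptions, since the truncated abstract does not specify them.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Project each feature vector onto the unit hypersphere.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def kmeans_labels(x, k, n_iter=20, seed=0):
    # Plain Lloyd iterations with Euclidean distance (an assumption; requires k <= len(x)).
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels

def diverse_outlier_sample(features, scores, k, seed=0):
    """Return the index of the highest-scoring candidate outlier in each cluster.

    `scores` is a hypothetical informativeness measure (higher = more informative).
    """
    features = np.asarray(features, dtype=float)
    scores = np.asarray(scores, dtype=float)
    labels = kmeans_labels(l2_normalize(features), k, seed=seed)
    picked = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size > 0:
            picked.append(int(idx[np.argmax(scores[idx])]))
    return np.array(picked, dtype=int)
```

Picking one sample per cluster, rather than the global top-k by score, is what keeps the selected batch from concentrating on a single outlier type.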
Peer Reviews
Decision·ICLR 2024 poster
- The paper is easy to follow. - The basic idea of the proposed approach is very simple and elegant. - The experimental results are compelling.
- By normalizing the feature vectors, if I understand correctly, Euclidean norm is used so that the normalized vector resides on the unit hypersphere. Is there any benefit to using specialized versions of k-means (and k-means related) algorithms for the hypersphere? For reference, there are versions of k-means and the Gaussian mixture model that are restricted to the hypersphere (technically, mixtures of von Mises-Fisher distributions). See, for instance, the papers by Banerjee et al (2005) and
1. This paper focuses on an important and practical question on outlier exposure, i.e., the auxiliary outliers may fail to capture the full outlier distribution. 2. This paper proposes a new method, namely DOS, which clusters the normalized features at each iteration and samples the informative outlier from each cluster to realize the diversified outlier selection. The technical design for clustering with normalized features is novel to the knowledge of the reviewer and shows promising empiric
Overall, this work presents a concise and effective way to conduct diverse sampling in outlier exposure. Here are the major concerns with the current version of this paper, which I hope will help improve it. 1. Although the overall presentation is clear, some critical definitions and claims are questionable and lack convincing support. 2. Technically, the proposed method (DOS) is based on an empirical demonstration of "diversity" with the OOD detection performance. However, the
1. The paper studies a well-motivated problem in OOD detection. Finding high-quality samples in the large auxiliary dataset improves the performance of the trained model and reduces the training cost. 2. The proposed diverse sampling is simple and effective. 3. The proposed method achieves impressive empirical results on the common benchmark. 4. The authors provide extensive ablation studies to demonstrate the robustness of the method.
1. Diverse sampling is a well-known method in active learning [1,2]. Diverse sampling leading to superior performance is not surprising. It indeed improves the performance of OOD models. Critically speaking, the novelty in terms of the method is limited. If the authors could provide some deeper theoretical analysis, e.g., how diverse sampling improves the generalization bound, it would make the paper more solid. 2. There are typos in equation 2. I understand that the authors use a cross-entropy
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
