Towards Realistic Out-of-Distribution Detection: A Novel Evaluation Framework for Improving Generalization in OOD Detection
Vahid Reza Khazaie, Anthony Wong, Mohammad Sabokrou

TL;DR
This paper introduces a new evaluation framework with realistic datasets and a generalizability score for OOD detection, revealing current models' limitations and proposing a post-processing method to improve performance under distribution shifts.
Contribution
It proposes new realistic OOD test datasets, a generalizability score, and a post-processing method to enhance pre-trained models' robustness in real-world OOD detection scenarios.
Findings
Existing benchmarks do not reflect real-world distribution shifts.
State-of-the-art pre-trained models perform poorly on new realistic datasets.
Post-processing significantly improves model performance under distribution shifts.
Abstract
This paper presents a novel evaluation framework for Out-of-Distribution (OOD) detection that aims to assess the performance of machine learning models in more realistic settings. We observed that the real-world requirements for testing OOD detection methods are not satisfied by the current testing protocols. They usually encourage methods to have a strong bias towards a low level of diversity in normal data. To address this limitation, we propose new OOD test datasets (CIFAR-10-R, CIFAR-100-R, and ImageNet-30-R) that can allow researchers to benchmark OOD detection performance under realistic distribution shifts. Additionally, we introduce a Generalizability Score (GS) to measure the generalization ability of a model during OOD detection. Our experiments demonstrate that improving the performance on existing benchmark datasets does not necessarily improve the usability of OOD detection…
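The comparison the abstract describes, scoring one detector on a standard test split and again on a shifted one, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's protocol: the scores are toy values invented for the example, the helper name `auroc` is hypothetical, and the gap computed at the end is a plain AUROC difference, not the paper's Generalizability Score, whose definition is not given on this page.

```python
def auroc(in_scores, ood_scores):
    """Probability that a random OOD sample receives a higher anomaly score
    than a random in-distribution sample (ties count as half)."""
    wins = 0.0
    for o in ood_scores:
        for i in in_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(in_scores) * len(ood_scores))

# Toy anomaly scores, invented for illustration only (not results from the paper).
# Convention assumed here: a higher score means "more likely OOD".
in_standard = [0.10, 0.20, 0.15, 0.30]  # in-distribution, standard test split
in_shifted = [0.35, 0.50, 0.25, 0.60]   # same classes under a distribution shift
ood = [0.55, 0.70, 0.45, 0.80]          # out-of-distribution samples

auroc_standard = auroc(in_standard, ood)
auroc_shifted = auroc(in_shifted, ood)

# Simple drop in AUROC caused by the shift (an assumed summary, not the paper's GS).
gap = auroc_standard - auroc_shifted

print(f"standard: {auroc_standard:.4f}, shifted: {auroc_shifted:.4f}, gap: {gap:.4f}")
```

In this toy setup the detector separates the standard split perfectly but confuses some shifted in-distribution samples with OOD ones, which is the kind of failure that a benchmark with low diversity in normal data would not reveal.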
Taxonomy
Topics: COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications
Methods: Test
