Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox
Xingming Long, Jie Zhang, Shiguang Shan, Xilin Chen

TL;DR
This paper introduces IS-OOD, a benchmark that evaluates out-of-distribution detection across graded degrees of semantic and covariate shift, and uses it to analyze how current detection methods behave.
Contribution
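The paper proposes the IS-OOD benchmark, which divides test samples into subsets by their degree of semantic and covariate shift relative to the ID dataset, and the LAID feature decomposition used to measure those shifts. Together they address the Sorites Paradox of deciding where in-distribution semantics end and OOD semantics begin.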
Findings
Detection performance improves as the semantic shift of the test samples increases.
Some methods depend less on semantic shift than others, which indicates that they rely on different detection mechanisms.
Some methods mistake samples with excessive covariate shift for OOD samples.
Abstract
Most existing out-of-distribution (OOD) detection benchmarks classify samples with novel labels as the OOD data. However, some marginal OOD samples actually have close semantic contents to the in-distribution (ID) sample, which makes determining the OOD sample a Sorites Paradox. In this paper, we construct a benchmark named Incremental Shift OOD (IS-OOD) to address the issue, in which we divide the test samples into subsets with different semantic and covariate shift degrees relative to the ID dataset. The data division is achieved through a shift measuring method based on our proposed Language Aligned Image feature Decomposition (LAID). Moreover, we construct a Synthetic Incremental Shift (Syn-IS) dataset that contains high-quality generated images with more diverse covariate contents to complement the IS-OOD benchmark. We evaluate current OOD detection methods on our benchmark and…
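The evaluation idea in the abstract, scoring a detector separately on test subsets with increasing shift, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the shift values stand in for whatever a shift-measuring method such as LAID would produce (this code does not implement LAID), the bin edges are arbitrary, and the function names `auroc`, `split_by_shift`, and `auroc_per_subset` are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Sequence


def auroc(id_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """Probability that a random test sample gets a higher OOD score than
    a random ID sample, with ties counted as 0.5 (pairwise AUROC)."""
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def split_by_shift(shift: Sequence[float], edges: Sequence[float]) -> List[List[int]]:
    """Group sample indices into bins. Bin k holds shifts in
    [edges[k], edges[k+1]); the last bin also includes its upper edge."""
    bins: List[List[int]] = [[] for _ in range(len(edges) - 1)]
    for idx, s in enumerate(shift):
        for k in range(len(bins)):
            is_last = k == len(bins) - 1
            if edges[k] <= s < edges[k + 1] or (is_last and s == edges[-1]):
                bins[k].append(idx)
                break
    return bins


def auroc_per_subset(
    id_scores: Sequence[float],
    test_scores: Sequence[float],
    test_shift: Sequence[float],
    edges: Sequence[float],
) -> Dict[int, float]:
    """AUROC of each shift subset against the ID set; empty bins are skipped."""
    result: Dict[int, float] = {}
    for k, members in enumerate(split_by_shift(test_shift, edges)):
        if members:
            result[k] = auroc(id_scores, [test_scores[i] for i in members])
    return result


if __name__ == "__main__":
    # Toy numbers, not results from the paper.
    id_scores = [0.1, 0.2, 0.3, 0.4]               # detector scores on ID data
    test_scores = [0.2, 0.3, 0.35, 0.25, 0.9, 0.8]  # detector scores on test data
    test_shift = [0.1, 0.2, 0.4, 0.5, 0.8, 0.9]     # hypothetical semantic shift
    edges = [0.0, 0.33, 0.66, 1.0]                  # arbitrary bin boundaries
    for k, value in auroc_per_subset(id_scores, test_scores, test_shift, edges).items():
        print(f"shift bin {k}: AUROC = {value:.3f}")
```

Reporting one number per shift bin, instead of a single AUROC over all test samples, is what makes it visible whether a detector's performance tracks the degree of shift. The same helper could be run a second time with covariate shift values in place of semantic ones.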
Taxonomy
Topics: Data-Driven Disease Surveillance · Advanced Statistical Process Monitoring
