A Benchmark of Medical Out of Distribution Detection
Tianshi Cao, Chin-Wei Huang, David Yu-Tung Hui, Joseph Paul Cohen

TL;DR
This paper benchmarks popular out-of-distribution (OoD) detection methods across three medical imaging domains (chest X-ray, fundus imaging, and histology slides). It finds that simple classifiers often outperform more complex methods, but that all methods struggle with images close to the training distribution.
Contribution
The paper introduces a benchmark for OoD detection in medical imaging, defines three categories of OoD examples, and compares popular detection methods across three imaging domains.
Findings
Simple binary classifiers perform best on average.
Most methods fail to detect images close to the training distribution.
Detection accuracy varies significantly across categories.
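To make the first two findings concrete, here is a minimal sketch of a binary in-distribution vs. OoD classifier over feature vectors, scored with AUROC. It is not the authors' code or data: the features are synthetic Gaussians, the "far" and "near" OoD sets are illustrative assumptions, and the names `train_logistic`, `ood_score`, and `auroc` are hypothetical helpers written for this example.

```python
import numpy as np


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by plain gradient descent; returns (w, b)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        z = np.clip(X @ w + b, -30.0, 30.0)  # clip to keep exp() well-behaved
        p = 1.0 / (1.0 + np.exp(-z))
        grad = p - y  # gradient of the log loss with respect to the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def ood_score(X, w, b):
    """Predicted probability that each row is out-of-distribution."""
    z = np.clip(X @ w + b, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-z))


def auroc(scores, labels):
    """Probability that a random OoD sample outscores a random in-distribution one."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


rng = np.random.default_rng(0)
dim = 16  # assumed feature dimension, chosen only for illustration

# Synthetic stand-ins for extracted features (assumption, not real image features):
# in-distribution ~ N(0, I), "far" OoD ~ N(3, I), "near" OoD ~ N(0.3, I).
X_train = np.vstack([rng.normal(0.0, 1.0, (500, dim)),
                     rng.normal(3.0, 1.0, (500, dim))])
y_train = np.concatenate([np.zeros(500), np.ones(500)])
w, b = train_logistic(X_train, y_train)

in_test = rng.normal(0.0, 1.0, (200, dim))
far_test = rng.normal(3.0, 1.0, (200, dim))
near_test = rng.normal(0.3, 1.0, (200, dim))
labels = np.concatenate([np.zeros(200), np.ones(200)])

auroc_far = auroc(ood_score(np.vstack([in_test, far_test]), w, b), labels)
auroc_near = auroc(ood_score(np.vstack([in_test, near_test]), w, b), labels)
print(f"AUROC far OoD:  {auroc_far:.3f}")
print(f"AUROC near OoD: {auroc_near:.3f}")
```

In this toy setup the classifier separates the far OoD set easily, while the near OoD set, which overlaps heavily with the in-distribution features, is detected less reliably. The gap only illustrates the kind of effect described in the findings; it says nothing about the magnitudes reported in the paper.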
Abstract
Motivation: Deep learning models deployed for use on medical tasks can be equipped with Out-of-Distribution Detection (OoDD) methods in order to avoid erroneous predictions. However, it is unclear which OoDD method should be used in practice.
Specific Problem: Systems trained for one particular domain of images cannot be expected to perform accurately on images of a different domain. These images should be flagged by an OoDD method prior to diagnosis.
Our approach: This paper defines three categories of OoD examples and benchmarks popular OoDD methods in three domains of medical imaging: chest X-ray, fundus imaging, and histology slides.
Results: Our experiments show that although the methods yield good results on some categories of out-of-distribution samples, they fail to recognize images close to the training distribution.
Conclusion: We find a simple binary classifier on the feature…
Taxonomy
Topics: COVID-19 diagnosis using AI · AI in cancer detection · Anomaly Detection Techniques and Applications
