Unmasking the Chameleons: A Benchmark for Out-of-Distribution Detection in Medical Tabular Data
Mohammad Azizmalayeri, Ameen Abu-Hanna, Giovanni Cinà

TL;DR
This paper introduces a comprehensive benchmark for out-of-distribution detection in medical tabular data, revealing that current methods are effective for far-OODs but struggle with near-OODs, especially in simpler architectures.
Contribution
It provides the first extensive, reproducible benchmark comparing OOD detection methods across multiple architectures and datasets in medical tabular data.
Findings
Far-OOD detection is effective with current methods.
Post-hoc methods improve when combined with distance-based approaches.
Transformers are less overconfident than MLP and ResNet.
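To make the far-OOD versus near-OOD distinction concrete, the following is a minimal sketch on synthetic Gaussian data. It is not the paper's benchmark code and does not use eICU or MIMIC-IV. The Mahalanobis distance is used here only as one illustrative example of a distance-based score, and the shift sizes (0.5 for "near", 5.0 for "far"), sample counts, and all function names are assumptions made for this illustration.

```python
import numpy as np

def mahalanobis_scores(train, x):
    """Distance-based OOD score: higher means further from the training data."""
    mean = train.mean(axis=0)
    # Small ridge term keeps the covariance matrix invertible.
    cov = np.cov(train, rowvar=False) + 1e-6 * np.eye(train.shape[1])
    inv_cov = np.linalg.inv(cov)
    diff = x - mean
    # Quadratic form diff^T inv_cov diff, computed row by row.
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))

def auroc(scores_id, scores_ood):
    """Probability that a random OOD sample scores higher than a random ID sample."""
    greater = (scores_ood[:, None] > scores_id[None, :]).mean()
    ties = (scores_ood[:, None] == scores_id[None, :]).mean()
    return float(greater + 0.5 * ties)

rng = np.random.default_rng(0)
n_features = 5  # hypothetical number of tabular features

# In-distribution data, plus two synthetic OOD sets (assumed shifts).
train = rng.normal(0.0, 1.0, size=(2000, n_features))
test_id = rng.normal(0.0, 1.0, size=(500, n_features))
near_ood = rng.normal(0.5, 1.0, size=(500, n_features))  # small shift
far_ood = rng.normal(5.0, 1.0, size=(500, n_features))   # large shift

s_id = mahalanobis_scores(train, test_id)
auroc_near = auroc(s_id, mahalanobis_scores(train, near_ood))
auroc_far = auroc(s_id, mahalanobis_scores(train, far_ood))

print(f"near-OOD AUROC: {auroc_near:.3f}")
print(f"far-OOD AUROC:  {auroc_far:.3f}")
```

In this toy setting the far-OOD set is separated almost perfectly, while the near-OOD set overlaps heavily with the in-distribution scores. This mirrors the qualitative pattern in the findings, not the paper's reported numbers.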
Abstract
Despite their success, Machine Learning (ML) models do not generalize effectively to data not originating from the training distribution. To reliably employ ML models in real-world healthcare systems and avoid inaccurate predictions on out-of-distribution (OOD) data, it is crucial to detect OOD samples. Numerous OOD detection approaches have been suggested in other fields - especially in computer vision - but it remains unclear whether the challenge is resolved when dealing with medical tabular data. To answer this pressing need, we propose an extensive reproducible benchmark to compare different methods across a suite of tests including both near and far OODs. Our benchmark leverages the latest versions of eICU and MIMIC-IV, two public datasets encompassing tens of thousands of ICU patients in several hospitals. We consider a wide array of density-based methods and SOTA post-hoc…
Taxonomy
Topics: Machine Learning in Healthcare · Frailty in Older Adults · Colorectal Cancer Screening and Detection
Methods: Multi-Head Attention · Attention Is All You Need · Average Pooling · Residual Block · Linear Layer · Global Average Pooling · Max Pooling · Kaiming Initialization · Batch Normalization
