Out-of-distribution Reject Option Method for Dataset Shift Problem in Early Disease Onset Prediction
Taisei Tosaki, Eiichiro Uchino, Ryosuke Kojima, Yohei Mineharu, Yuji Okamoto, Mikio Arita, Nobuyuki Miyai, Yoshinori Tamada, Tatsuya Mikami, Koichi Murashita, Shigeyuki Nakaji, Yasushi Okuno

TL;DR
This paper introduces ODROP, an out-of-distribution (OOD) reject option method that improves disease onset prediction under dataset shift by using an OOD detection model to exclude OOD data before prediction. On real-world health checkup datasets, AUROC for diabetes prediction rose from 0.80 to 0.90.
Contribution
The study proposes a novel OOD reject option method for disease prediction, applying it to health data and demonstrating improved accuracy under dataset shift conditions.
Findings
The variational autoencoder outperformed the other OOD detection methods evaluated.
AUROC for diabetes prediction increased from 0.80 to 0.90 with ODROP.
ODROP significantly reduces misclassification due to dataset shift.
Abstract
Machine learning is increasingly used to predict lifestyle-related disease onset using health and medical data. However, its predictive accuracy in practice is often hindered by dataset shift, which refers to discrepancies in data distribution between the training and testing datasets. This issue leads to the misclassification of out-of-distribution (OOD) data. To mitigate the effect of dataset shift in real-world settings, this paper proposes the out-of-distribution reject option for prediction (ODROP). This method integrates an OOD detection model to exclude OOD data from the prediction phase. We used two real-world health checkup datasets (Hirosaki and Wakayama) with dataset shift, across three disease onset prediction tasks: diabetes, dyslipidemia, and hypertension. Both components of the ODROP method (the OOD detection model and the prediction model) were trained on the Hirosaki dataset. We…
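The reject option described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the OOD scorer here is a simple Mahalanobis distance fitted on training features, standing in for the OOD detection models the paper compares (such as the variational autoencoder), and the names `fit_ood_scorer`, `predict_with_reject`, `toy_predict_proba`, the synthetic data, and the 95th-percentile threshold are all assumptions made for illustration.

```python
import numpy as np


def fit_ood_scorer(X_train):
    """Fit a Gaussian to the training features and return a scoring function.

    Assumption: squared Mahalanobis distance is used only as a stand-in
    for the paper's OOD detection model. Higher score = more OOD-like.
    """
    mean = X_train.mean(axis=0)
    cov = np.cov(X_train, rowvar=False) + 1e-6 * np.eye(X_train.shape[1])
    cov_inv = np.linalg.inv(cov)

    def score(X):
        d = X - mean
        return np.einsum("ij,jk,ik->i", d, cov_inv, d)

    return score


def predict_with_reject(X, predict_proba, ood_score, threshold):
    """Predict only for samples whose OOD score is at or below the threshold.

    Rejected samples get NaN instead of a probability.
    """
    accepted = ood_score(X) <= threshold
    probs = np.full(X.shape[0], np.nan)
    if accepted.any():
        probs[accepted] = predict_proba(X[accepted])
    return probs, accepted


def toy_predict_proba(X):
    """Hypothetical prediction model: a fixed logistic function of the feature sum."""
    return 1.0 / (1.0 + np.exp(-X.sum(axis=1)))


# Synthetic stand-in for the training cohort (not real health checkup data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))

ood_score = fit_ood_scorer(X_train)
# Assumed threshold choice: 95th percentile of the training scores.
threshold = np.quantile(ood_score(X_train), 0.95)

# One sample near the training distribution, one far outside it.
X_test = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
probs, accepted = predict_with_reject(X_test, toy_predict_proba, ood_score, threshold)
```

In this sketch both components are fitted on the same training data, mirroring the abstract's statement that the OOD detection model and the prediction model were trained on the Hirosaki dataset. Evaluation metrics such as AUROC would then be computed only on the accepted samples.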
Taxonomy
Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare
Methods: Shapley Additive Explanations
