Binary disease prediction using tail quantiles of the distribution of continuous biomarkers
Michiel H.J. Paus, Edwin R. van den Heuvel, Marc J.M. Meddens

TL;DR
This paper introduces quantile-based prediction (QBP), a novel binary disease classification method that leverages distribution tails of biomarkers to improve detection, especially when variance differences are prominent.
Contribution
The study proposes QBP, a new biomarker selection and classification approach based on distribution tails, demonstrating its advantages over traditional mean-based methods in heterogeneous biological data.
Findings
QBP outperforms other methods when biomarkers differ in variance between cases and controls.
QBP is effective in selecting relevant biomarkers.
QBP performs well in case studies of depression and trisomy.
Abstract
In binary disease classification, a single biomarker might not have significant discriminating power, so multiple biomarkers must be selected from a large set. Numerous approaches exist, but they work well mainly for mean differences in biomarkers between cases and controls. Biological processes, however, are much more heterogeneous, and differences could also occur in other distributional characteristics (e.g. variances, skewness). Many machine learning techniques are better able to exploit these higher-order distributional differences, sometimes at the cost of explainability. In this study we propose quantile-based prediction (QBP), a binary classification method that is based on the selection of multiple continuous biomarkers. QBP generates a single score using the tails of the biomarker distributions for cases and controls. This single score can then…
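The idea of scoring subjects by the tails of biomarker distributions can be illustrated with a small simulation. The sketch below is not the authors' QBP algorithm, whose details are not given in this summary. It is a hypothetical toy score that counts how many biomarkers fall outside the 5% and 95% quantiles of the controls. The simulated data (five independent Gaussian biomarkers, equal means, larger standard deviation in cases), the cutoffs, and the helper names `tail_score` and `auc` are all assumptions made for illustration.

```python
import random
import statistics


def tail_score(row, lower, upper):
    """Count the biomarkers in `row` that lie outside the control-based cutoffs."""
    return sum(1 for x, lo, hi in zip(row, lower, upper) if x < lo or x > hi)


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Assumed toy data: same mean in both groups, but cases have twice the spread.
rng = random.Random(0)
n_subjects, n_markers = 400, 5
controls = [[rng.gauss(0, 1) for _ in range(n_markers)] for _ in range(n_subjects)]
cases = [[rng.gauss(0, 2) for _ in range(n_markers)] for _ in range(n_subjects)]

# Tail cutoffs per biomarker: the 5% and 95% quantiles of the controls.
lower, upper = [], []
for j in range(n_markers):
    cuts = statistics.quantiles([row[j] for row in controls], n=20)
    lower.append(cuts[0])   # 5% quantile
    upper.append(cuts[-1])  # 95% quantile

# Tail-based score versus a simple mean-oriented score (sum of biomarkers).
# Note: both are evaluated in-sample, which is optimistic for real data.
tail_auc = auc([tail_score(r, lower, upper) for r in cases],
               [tail_score(r, lower, upper) for r in controls])
mean_auc = auc([sum(r) for r in cases], [sum(r) for r in controls])

print(f"tail-count score AUC: {tail_auc:.2f}")
print(f"sum-of-markers AUC:   {mean_auc:.2f}")
```

Under these assumptions the sum of biomarkers carries almost no information, because the group means coincide, whereas the tail count separates the groups because cases land in the tails far more often. A real analysis would estimate the cutoffs on training data and evaluate on held-out subjects.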
Taxonomy
Topics: Machine Learning in Healthcare · Genetic Associations and Epidemiology · Bioinformatics and Genomic Networks
