Machine Learning Driven Biomarker Selection for Medical Diagnosis
Divyagna Bavikadi, Ayushi Agarwal, Shashank Ganta, Yunro Chung, Lusheng Song, Ji Qiu, Paulo Shakarian

TL;DR
This study compares biomarker selection methods and machine learning classifiers for diagnosing disease with fewer biomarkers. It finds that contemporary methods outperform the previously reported logistic regression when only 3 or 10 biomarkers are permitted.
Contribution
The study evaluates 16 approaches, pairing each of 4 biomarker selection methods with each of 4 ML classifiers, and identifies the most effective pairings under different limits on the number of biomarkers.
Findings
Contemporary methods outperform logistic regression with 3 and 10 biomarkers.
With specificity fixed at 0.9, ML approaches reach a sensitivity of 0.240 (3 biomarkers) and 0.520 (10 biomarkers), higher than logistic regression.
Causal-based methods excel with fewer biomarkers, univariate methods with more.
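The "sensitivity at fixed specificity" figure used in these findings can be computed from any classifier's scores by choosing a decision threshold on the healthy samples and then measuring how many diseased samples fall above it. The block below is a minimal sketch of that calculation, not the authors' evaluation code: the function name, the strict "score above threshold" rule, and the toy labels and scores are all assumptions made for illustration.

```python
def sensitivity_at_specificity(labels, scores, target_specificity=0.9):
    """Return (sensitivity, specificity, threshold) for the most sensitive
    cut-off whose specificity is at least target_specificity.

    labels: 1 for diseased, 0 for healthy.
    scores: classifier outputs, higher meaning more likely diseased.
    A sample is called positive when its score is strictly above the threshold.
    """
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative sample")

    # Raising the threshold can only raise specificity and lower sensitivity,
    # so the first threshold that meets the target is the most sensitive one.
    for threshold in sorted(set(scores)):
        specificity = sum(s <= threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            sensitivity = sum(s > threshold for s in positives) / len(positives)
            return sensitivity, specificity, threshold

    raise ValueError("target_specificity must be at most 1.0")


# Made-up toy data (10 healthy, 5 diseased); these are not the paper's values.
toy_labels = [0] * 10 + [1] * 5
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80,
              0.30, 0.50, 0.60, 0.70, 0.90]

sens, spec, thr = sensitivity_at_specificity(toy_labels, toy_scores, 0.9)
print(f"sensitivity={sens:.3f} at specificity={spec:.3f} (threshold={thr})")
```

Fixing specificity first makes methods comparable: every selection method and classifier pairing is held to the same false-positive rate, so differences in sensitivity reflect how well each pairing detects disease with the allowed number of biomarkers.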
Abstract
Recent advances in experimental methods have enabled researchers to collect data on thousands of analytes simultaneously. This has led to correlational studies that associated molecular measurements with diseases such as Alzheimer's, Liver, and Gastric Cancer. However, the use of thousands of biomarkers selected from the analytes is not practical for real-world medical diagnosis and is likely undesirable due to potentially formed spurious correlations. In this study, we evaluate 4 different methods for biomarker selection and 4 different machine learning (ML) classifiers for identifying correlations, evaluating 16 approaches in all. We found that contemporary methods outperform previously reported logistic regression in cases where 3 and 10 biomarkers are permitted. When specificity is fixed at 0.9, ML approaches produced a sensitivity of 0.240 (3 biomarkers) and 0.520 (10 biomarkers),…
Taxonomy
Topics: Artificial Intelligence in Healthcare · Brain Tumor Detection and Classification · Genetics, Bioinformatics, and Biomedical Research
Methods: Feature Selection · Logistic Regression
