Multiclass Disease Predictions Based on Integrated Clinical and Genomics Datasets
Moeez M. Subhani, Ashiq Anjum

TL;DR
This study develops a machine learning model that integrates clinical and genomics data to predict multiple diseases, demonstrating 73% accuracy and highlighting the value of genomics in clinical diagnostics and precision medicine.
Contribution
The paper introduces a novel integrated dataset combining clinical and genomics data for multiclass disease prediction using machine learning, with a high number of classes and effective feature selection.
Findings
Achieved 73% prediction accuracy on integrated data
Demonstrated reliable inclusion of genomics in clinical predictions
Validated effectiveness of PCA for feature selection
Abstract
Clinical predictions using clinical data by computational methods are common in bioinformatics. However, clinical predictions that also use information from genomics datasets are rarely seen in research. Precision medicine research requires information from all available datasets to provide intelligent clinical solutions. In this paper, we have attempted to create a prediction model which uses information from both clinical and genomics datasets. We have demonstrated multiclass disease predictions based on combined clinical and genomics datasets using machine learning methods. We have created an integrated dataset from a clinical (ClinVar) and a genomics (gene expression) dataset, and trained an instance-based learner on it to predict clinical diseases. We have used an innovative but simple way for multiclass classification, where the number of output classes…
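The general shape of the approach summarized above (integrate clinical and genomics features, reduce them with PCA, classify with an instance-based learner) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data is synthetic rather than ClinVar or gene expression records, k-nearest neighbours stands in for the unspecified instance-based learner, and the sample counts, feature counts, class count, number of principal components, and all variable names are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical sizes; the paper's real dataset dimensions are not given here.
n_samples, n_clinical, n_genomic, n_classes = 600, 5, 40, 6
rng = np.random.default_rng(0)

# Synthetic disease labels and two synthetic feature blocks whose
# distributions shift with the class, so there is a signal to learn.
labels = rng.integers(0, n_classes, size=n_samples)
clinical = rng.normal(size=(n_samples, n_clinical)) + 0.5 * labels[:, None]
class_centers = rng.normal(scale=2.0, size=(n_classes, n_genomic))
genomic = class_centers[labels] + rng.normal(size=(n_samples, n_genomic))

# "Integration" here is a plain column-wise join of the two blocks
# (an assumption; the source does not describe the join in this excerpt).
integrated = np.hstack([clinical, genomic])

X_train, X_test, y_train, y_test = train_test_split(
    integrated, labels, test_size=0.25, random_state=0, stratify=labels
)

# Standardize, project onto 10 principal components, then classify with k-NN.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    KNeighborsClassifier(n_neighbors=5),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed by this sketch reflects only the synthetic data and has no bearing on the 73% figure reported for the paper. Wrapping scaling and PCA in a pipeline keeps both fitted on the training split only, which avoids leaking test-set statistics into the reduced features.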
Taxonomy
Topics: Gene expression and cancer classification · Genetics, Bioinformatics, and Biomedical Research · Bioinformatics and Genomic Networks
