Comparative Performance of Machine Learning Algorithms for Early Genetic Disorder and Subclass Classification

Abu Bakar Siddik; Faisal R. Badal; Afroza Islam

arXiv:2412.02189 · cs.AI · December 4, 2024

Open Access

TL;DR

This study evaluates machine learning algorithms on basic clinical data to classify genetic disorders early in life, reaching 77% accuracy on disorder classes (CatBoost) and 80% accuracy on disorder subtypes (SVM), which suggests potential for timely diagnosis and intervention.

Contribution

The paper introduces a machine learning approach that uses basic clinical indicators for early genetic disorder classification, with separately optimized models for disorder classes and subtypes.

Findings

1. CatBoost achieved 77% accuracy for disorder classes.
2. SVM attained 80% accuracy for disorder subtypes.
3. The models demonstrate the feasibility of early diagnosis using simple clinical data.

Abstract

Considerable effort has gone into detecting individual genetic disorders, but classification across a broad spectrum of disorder classes and types remains elusive. Early diagnosis of genetic disorders enables timely interventions and improves outcomes. This study implements machine learning models that use basic clinical indicators measurable at birth or in infancy to enable diagnosis at early life stages. Supervised learning algorithms were applied to a dataset of 22,083 instances with 42 features such as family history, newborn metrics, and basic lab tests. Extensive hyperparameter tuning, feature engineering, and feature selection were undertaken. Two multi-class classifiers were developed: one for predicting disorder classes (mitochondrial, multifactorial, and single-gene) and one for subtypes (9 disorders). Performance was evaluated using accuracy, precision, recall, and…
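The workflow the abstract describes (a tuned multi-class classifier scored on accuracy, precision, and recall) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses synthetic data in place of the clinical dataset, and the hyperparameter grid, split ratio, and macro averaging are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical data (NOT the paper's dataset):
# 42 features as in the abstract, 3 classes for the disorder-class task.
X, y = make_classification(
    n_samples=600,
    n_features=42,
    n_informative=10,
    n_classes=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit an SVM. The grid is an illustrative
# assumption, not the search space used in the paper.
pipeline = make_pipeline(StandardScaler(), SVC())
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Score the best model on the held-out split with macro-averaged metrics.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average="macro", zero_division=0)
recall = recall_score(y_test, y_pred, average="macro", zero_division=0)

print("best params:", search.best_params_)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

The same structure would apply to the 9-way subtype task by changing the label column. The scores printed here reflect only the synthetic data and say nothing about the results reported in the paper.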

Taxonomy

Topics: Artificial Intelligence in Healthcare

Methods: Support Vector Machine