Proof-of-Concept Machine Learning Framework for Arboviral Disease Classification Using Literature-Derived Synthetic Data: Methodological Development Preceding Clinical Validation
Elí Cruz-Parada, Guillermina Vivar-Estudillo, Laura Pérez-Campos Mayoral, María Teresa Hernández-Huerta, Alma Dolores Pérez-Santiago, Carlos Romero-Diaz, Eduardo Pérez-Campos Mayoral, Iván A. García Montalvo, Lucia Martínez-Martínez, Héctor Martínez-Ruiz, Idarh Matadamas

TL;DR
A machine learning framework was developed to classify arboviral diseases using literature-derived synthetic data; its Narrow Neural Network model reached an accuracy of 0.92 and reliably distinguished Dengue from Influenza.
Contribution
A novel proof-of-concept ML framework for arboviral disease classification is proposed and evaluated on synthetic data, ahead of clinical validation.
Findings
The synthetic dataset aligns with PAHO guidelines and mirrors real-world arboviral databases.
The Narrow Neural Network model achieved high accuracy (0.92) and AUC (above 0.98) in classifying arboviral diseases.
The model reliably distinguishes Dengue from Influenza but shows slightly lower performance between Zika and Chikungunya.
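The per-class behaviour described above (strong separation for some disease pairs, weaker for others) is usually read off a confusion matrix. The following is a minimal sketch of how accuracy, one-vs-rest sensitivity, specificity, precision, F1 and Cohen's kappa could be computed from such a matrix. The function names and the 4x4 counts are hypothetical and chosen only for illustration; they are not the paper's code or results. AUC is left out because it requires predicted scores rather than counts.

```python
# Illustrative only: the labels' counts below are made up, not taken from the paper.

def _ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return _ratio(correct, total)

def one_vs_rest(cm, k):
    """Sensitivity, specificity, precision and F1 for class k against all others.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # true class k, predicted as something else
    fp = sum(row[k] for row in cm) - tp  # other classes predicted as k
    tn = total - tp - fn - fp
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1}

def cohen_kappa(cm):
    """Agreement between true and predicted labels, corrected for chance."""
    total = sum(sum(row) for row in cm)
    if total == 0:
        return 0.0
    observed = accuracy(cm)
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(len(cm))
    ) / total ** 2
    return _ratio(observed - expected, 1 - expected)

if __name__ == "__main__":
    labels = ["Dengue", "Zika", "Chikungunya", "Influenza"]
    # Hypothetical counts: rows are true classes, columns are predictions.
    cm = [
        [9, 1, 0, 0],
        [0, 7, 3, 0],
        [0, 2, 8, 0],
        [0, 0, 0, 10],
    ]
    print("accuracy:", accuracy(cm))
    print("kappa:", cohen_kappa(cm))
    for k, name in enumerate(labels):
        print(name, one_vs_rest(cm, k))
```

In this toy matrix the off-diagonal counts sit between Zika and Chikungunya, so the one-vs-rest sensitivity for those two classes drops while Influenza stays perfectly separated, which is the kind of pattern a per-class breakdown makes visible and a single accuracy figure hides.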
Abstract
What are the main findings?
Extraction and selection of features from 67 symptoms using binary coding.
A classification model for arboviral diseases built with different methods based on machine learning and deep learning.
What are the implications of the main findings?
Rigorous statistical analysis of the data, using the odds ratio and the chi-square test, to identify the symptoms that are more prevalent in each arboviral disease.
Performance evaluation using metrics such as F1-score, accuracy, precision, sensitivity, specificity, AUC-ROC, and Cohen's kappa.
…
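To make the symptom-level analysis concrete, here is a minimal sketch of binary symptom coding followed by an odds ratio and a Pearson chi-square test on a 2x2 table (symptom present or absent against one disease versus all others). The symptom names, record layout and function names are assumptions made for illustration, and the test is computed without continuity correction; the paper's actual pipeline may differ.

```python
import math

def encode(symptoms, vocabulary):
    """Binary coding: 1 if the symptom in the vocabulary was reported, else 0."""
    return [1 if name in symptoms else 0 for name in vocabulary]

def two_by_two(records, symptom, disease):
    """Count (a, b, c, d) for one symptom and one disease.

    a: disease, symptom present     b: disease, symptom absent
    c: other,   symptom present     d: other,   symptom absent
    records is an iterable of (label, set_of_symptoms) pairs (hypothetical layout).
    """
    a = b = c = d = 0
    for label, symptoms in records:
        present = symptom in symptoms
        if label == disease:
            a, b = a + present, b + (not present)
        else:
            c, d = c + present, d + (not present)
    return a, b, c, d

def odds_ratio(a, b, c, d):
    """Odds of the symptom in the disease group over the odds in the others."""
    if b == 0 or c == 0:
        return math.inf  # undefined in the strict sense; a correction is often applied
    return (a * d) / (b * c)

def chi_square(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) and its p-value."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

if __name__ == "__main__":
    vocabulary = ["fever", "rash", "joint_pain"]  # hypothetical subset of the 67 symptoms
    print(encode({"fever", "rash"}, vocabulary))
    # Toy records, not real or paper data.
    records = (
        [("Dengue", {"fever", "rash"})] * 30 + [("Dengue", {"fever"})] * 10
        + [("Other", {"rash"})] * 10 + [("Other", {"fever"})] * 30
    )
    table = two_by_two(records, "rash", "Dengue")
    print(table, odds_ratio(*table), chi_square(*table))
```

An odds ratio above 1 with a small p-value suggests the symptom is more common in the target disease than in the rest, which is the kind of screening that would precede feature selection.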
Taxonomy
Topics: Mosquito-borne diseases and control · Data-Driven Disease Surveillance · Machine Learning in Healthcare
