# P-2006. Development of a Machine Learning Model for the Accurate Diagnosis of Common Tropical Febrile Illnesses

**Authors:** C Shravya, R Rajalakshmi, Muhammed Rashid, Girish Thunga, Vijayanarayana Kunhikatta, Muralidhar Varma, Vasudha Devi, Raviraj V Acharya, K N Shivshankar, Ashwini Amin, Dinesh Acharya U, Sohil Khan

PMC · DOI: 10.1093/ofid/ofaf695.2170 · Open Forum Infectious Diseases · 2026-01-11

## TL;DR

This study develops an AI-based tool to improve the diagnosis of tropical febrile illnesses like dengue and malaria by using machine learning models trained on clinical data.

## Contribution

The novel contribution is the development of an AI diagnostic tool using feature selection and stacking classifiers to improve accuracy in tropical febrile illness diagnosis.

## Key findings

- A stacking classifier achieved 89% accuracy (reported on the training dataset) in diagnosing tropical febrile illnesses.
- Random Forest showed the highest individual model accuracy at 87%.
- Twenty non-collinear clinical features were selected for model development.

## Abstract

Acute febrile illnesses (AFIs) such as dengue, malaria, scrub typhus, and leptospirosis, which are prevalent in tropical regions, account for 17% of the global disease burden. Overlapping clinical features and the limitations of current diagnostic methods, including false positives and sensitivity and specificity issues, complicate diagnosis. With increasing digital integration in healthcare, artificial intelligence (AI) offers a promising solution for improving diagnostic accuracy and efficiency. Our study aimed to develop an AI-based tool to aid differential diagnosis of AFIs and support clinical decision-making.

Fig 1: Features selected through Recursive Elimination and Multicollinearity. Based on Recursive Feature Elimination (RFE), a set of high-ranking features was initially identified. These features were then subjected to multicollinearity assessment to eliminate redundant variables with strong linear relationships. The final set of 20 non-collinear, clinically relevant features was selected for model development and is presented in Fig 1.

Fig 2: Accuracy of the developed models


A retrospective cross-sectional study analyzed the records of 800 patients (200 per disease). Clinical data were extracted and preprocessed (cleaning, scaling, imputation). Feature selection was conducted using Recursive Feature Elimination (RFE) and multicollinearity assessment. Models were developed using supervised machine learning algorithms: Random Forest (RF), Naïve Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Decision Tree (DT). A stacking classifier served as a meta-model. Performance was evaluated using accuracy, precision, recall, and F1-score.

Fig 3: Confusion matrix for Stacking Classifier

Table 1: Other performance metrics
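The pipeline described in the methods (scaling, RFE, multicollinearity filtering, six base models, and a stacking classifier) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (`make_classification`), and the RFE estimator, the RFE target of 25 features, the 0.8 correlation threshold, the logistic-regression meta-learner, the held-out split, and the helper `drop_collinear` are all choices made for this example rather than details reported in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def drop_collinear(X, threshold=0.8):
    """Hypothetical helper: greedily keep columns whose absolute Pearson
    correlation with every already-kept column is at most `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep


# Synthetic stand-in for the clinical table: 800 rows, 4 classes (assumption).
X, y = make_classification(
    n_samples=800, n_features=40, n_informative=12, n_redundant=8,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing: fit the scaler on the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Step 1: RFE ranks features; the random-forest estimator is an assumption.
rfe = RFE(
    RandomForestClassifier(n_estimators=100, random_state=0),
    n_features_to_select=25, step=5,
).fit(X_train_s, y_train)
rfe_idx = np.flatnonzero(rfe.support_)

# Step 2: remove strongly correlated features among the RFE survivors.
keep_local = drop_collinear(X_train_s[:, rfe_idx], threshold=0.8)
selected = rfe_idx[keep_local]

# Step 3: six base models combined by a stacking classifier.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("nb", GaussianNB()),
    ("lr", LogisticRegression(max_iter=1000)),
    ("svm", SVC()),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(random_state=0)),
]
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),  # meta-learner (assumption)
    cv=5,
)
stack.fit(X_train_s[:, selected], y_train)
test_accuracy = stack.score(X_test_s[:, selected], y_test)
print(f"{len(selected)} features kept; held-out accuracy = {test_accuracy:.3f}")
```

Selecting features and fitting the scaler on the training split only keeps information from the held-out rows out of the model. The accuracy printed here reflects the synthetic data and has no bearing on the figures reported in the study.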


Based on RFE, a set of high-ranking features was identified and then subjected to multicollinearity assessment. The final set of 20 non-collinear, clinically relevant features was selected for model development (presented in Fig 1). Among the developed predictive models, RF demonstrated the highest classification accuracy (87%), followed by LR (86.5%), SVM (85.5%), KNN (73%), DT (72.5%), and NB (70.5%) (shown in Fig 2). Additional performance metrics, including precision, recall, and F1-score, are summarized in Table 1. A stacking classifier, integrating the predictions of all base models, achieved an overall accuracy of 89% on the training dataset. The confusion matrix for the stacking model is presented in Fig 3.
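To make the relationship between a multi-class confusion matrix and the reported metrics concrete, the sketch below derives accuracy and per-class precision, recall, and F1-score in plain Python. The function name `per_class_metrics` and the matrix `toy_cm` are hypothetical; the counts are invented for illustration and are not the values from the study's confusion matrix or Table 1.

```python
def per_class_metrics(cm):
    """Compute overall accuracy and per-class (precision, recall, F1).

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    rows = []
    for i in range(n):
        tp = cm[i][i]
        predicted_i = sum(cm[r][i] for r in range(n))  # column sum
        actual_i = sum(cm[i])                          # row sum
        precision = tp / predicted_i if predicted_i else 0.0
        recall = tp / actual_i if actual_i else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((precision, recall, f1))
    return accuracy, rows


labels = ["dengue", "malaria", "scrub typhus", "leptospirosis"]

# Invented counts for illustration only; NOT the study's results.
toy_cm = [
    [8, 1, 1, 0],
    [1, 9, 0, 0],
    [0, 1, 7, 2],
    [1, 0, 1, 8],
]

accuracy, rows = per_class_metrics(toy_cm)
print(f"accuracy = {accuracy:.2f}")
for name, (p, r, f1) in zip(labels, rows):
    print(f"{name:14s} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Per-class values are useful here because a single accuracy figure can hide one disease being confused with another, which is the practical concern when clinical features overlap.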

By integrating clinical data with advanced feature selection techniques, an AI-based tool could serve as a future diagnostic aid for screening tropical diseases.

All Authors: No reported disclosures

## Linked entities

- **Diseases:** dengue (MONDO:0005502), malaria (MONDO:0005136), scrub typhus (MONDO:0019365), leptospirosis (MONDO:0005825)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12793441/full.md

---
Source: https://tomesphere.com/paper/PMC12793441