Interactive Diabetes Risk Prediction Using Explainable Machine Learning: A Dash-Based Approach with SHAP, LIME, and Comorbidity Insights
Udaya Allani

TL;DR
This paper introduces an interactive web tool for diabetes risk prediction that leverages explainable machine learning models, particularly LightGBM with undersampling, and provides personalized insights using SHAP, LIME, and comorbidity analysis.
Contribution
It develops a user-friendly Dash-based platform integrating multiple ML models with explainability tools and comorbidity insights for improved diabetes risk assessment.
Findings
LightGBM with undersampling achieved the highest recall among the evaluated models.
The tool explains individual predictions using SHAP and LIME.
Pearson correlation analysis highlights comorbidity correlations.
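As an illustration of the comorbidity finding, the following is a minimal sketch of a Pearson correlation between two binary health indicators, assuming the Pearson coefficient named in the abstract. The function name `pearson_r`, the indicator names, and the toy values are hypothetical and are not taken from the paper or from the BRFSS data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and centered squares (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy indicators (1 = condition reported, 0 = not reported).
diabetes = [1, 1, 0, 0, 1, 0]
high_bp = [1, 1, 0, 0, 0, 1]

r = pearson_r(diabetes, high_bp)
print(f"Pearson r = {r:.3f}")
```

For two binary indicators the Pearson coefficient coincides with the phi coefficient, so a value near 0 suggests little linear association and a value near 1 suggests the two conditions tend to be reported together.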
Abstract
This study presents a web-based interactive health risk prediction tool designed to assess diabetes risk using machine learning models. Built on the 2015 CDC BRFSS dataset, the study evaluates models including Logistic Regression, Random Forest, XGBoost, LightGBM, KNN, and Neural Networks under original, SMOTE, and undersampling strategies. LightGBM with undersampling achieved the best recall, making it ideal for risk detection. The tool integrates SHAP and LIME to explain predictions and highlights comorbidity correlations using Pearson analysis. A Dash-based UI enables user-friendly interaction with model predictions, personalized suggestions, and feature insights, supporting data-driven health awareness.
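The undersampling strategy and the recall metric mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the study's pipeline: the helper names `random_undersample` and `recall`, the fixed seed, and the toy data are all assumptions introduced here.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement down to the minority-class size.
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN): the share of actual positives that were detected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: 8 negative cases and 2 positive cases.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

bal_rows, bal_labels = random_undersample(rows, labels)
class_counts = Counter(bal_labels)
print(class_counts)  # both classes now have 2 rows

# One of the two actual positives is detected, so recall is 0.5.
toy_recall = recall([1, 1, 0, 0], [1, 0, 0, 1])
print(toy_recall)
```

Recall is a natural target for risk screening because it counts missed positive cases (false negatives) against the model, and balancing the classes before training is one common way to push a classifier toward detecting the minority class.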
Taxonomy
Topics: Artificial Intelligence in Healthcare · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)
Methods: Local Interpretable Model-Agnostic Explanations · Logistic Regression · Shapley Additive Explanations · Synthetic Minority Over-sampling Technique
