From Chat to Checkup: Can Large Language Models Assist in Diabetes Prediction?
Shadman Sakib, Oishy Fatema Akhand, Ajwad Abrar

TL;DR
This study explores the potential of large language models in predicting diabetes from structured data, comparing their performance with traditional machine learning models across various prompting strategies.
Contribution
The paper offers the first comprehensive empirical analysis of LLMs applied to structured numerical data for diabetes prediction, highlighting their strengths and limitations.
Findings
Proprietary LLMs outperform open-source models in accuracy.
Gemma-2-27B surpasses traditional ML models in F1-score.
LLM performance varies across prompting strategies, and the models still need domain-specific fine-tuning.
Abstract
While Machine Learning (ML) and Deep Learning (DL) models have been widely used for diabetes prediction, the use of Large Language Models (LLMs) for structured numerical data is still not well explored. In this study, we test the effectiveness of LLMs in predicting diabetes using zero-shot, one-shot, and three-shot prompting methods. We conduct an empirical analysis using the Pima Indian Diabetes Database (PIDD). We evaluate six LLMs, including four open-source models: Gemma-2-27B, Mistral-7B, Llama-3.1-8B, and Llama-3.2-2B. We also test two proprietary models: GPT-4o and Gemini Flash 2.0. In addition, we compare their performance with three traditional machine learning models: Random Forest, Logistic Regression, and Support Vector Machine (SVM). We use accuracy, precision, recall, and F1-score as evaluation metrics. Our results show that proprietary LLMs perform better than open-source…
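The zero-shot, one-shot, and three-shot setups and the four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The prompt wording, the helper names (`describe`, `build_prompt`, `binary_metrics`), and the sample values are assumptions made for illustration, not the authors' actual prompts or code. The feature names follow the commonly distributed PIDD CSV, and no LLM is called here.

```python
from typing import Dict, List, Tuple

# Column names as found in the commonly distributed PIDD CSV
# (an assumption; the summary above does not list them).
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]

Record = Dict[str, float]


def describe(record: Record) -> str:
    """Serialize one structured record as 'name: value' pairs."""
    return ", ".join(f"{name}: {record[name]}" for name in FEATURES)


def build_prompt(query: Record, examples: List[Tuple[Record, int]]) -> str:
    """Build a k-shot prompt; k = len(examples), so an empty list is zero-shot.

    The instruction text is hypothetical, not taken from the paper.
    """
    lines = [
        "Decide whether the patient has diabetes. "
        "Answer with 1 (diabetic) or 0 (non-diabetic)."
    ]
    for record, label in examples:
        lines.append(f"Patient: {describe(record)}")
        lines.append(f"Answer: {label}")
    lines.append(f"Patient: {describe(query)}")
    lines.append("Answer:")
    return "\n".join(lines)


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative values only.
sample = {
    "Pregnancies": 6, "Glucose": 148, "BloodPressure": 72, "SkinThickness": 35,
    "Insulin": 0, "BMI": 33.6, "DiabetesPedigreeFunction": 0.627, "Age": 50,
}
zero_shot = build_prompt(sample, [])
three_shot = build_prompt(sample, [(sample, 1), (sample, 0), (sample, 1)])
scores = binary_metrics([1, 0, 1, 1], [1, 0, 0, 1])
```

Under this framing, the three prompting strategies differ only in how many labeled records precede the query, and the same `binary_metrics` function could score LLM answers and the predictions of a traditional classifier alike.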
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare
