Glucose-ML: A collection of longitudinal diabetes datasets for development of robust AI solutions
Temiloluwa Prioleau, Baiying Lu, Yanjun Cui

TL;DR
Glucose-ML is a collection of 10 publicly available diabetes datasets with over 300,000 days of continuous glucose monitor (CGM) data, designed to facilitate the development of robust, transparent, and reproducible AI solutions for diabetes management.
Contribution
The paper introduces Glucose-ML, a new dataset collection, along with a comparative analysis and benchmark for blood glucose prediction, addressing data accessibility and variability issues in AI for diabetes.
Findings
Different datasets yield significantly different prediction results.
The same algorithm's performance varies across datasets.
Recommendations for robust AI development are provided.
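The finding that one algorithm can score differently from dataset to dataset can be illustrated with a minimal sketch. It is not the paper's benchmark: the last-value (persistence) forecaster, the function names `persistence_rmse` and `evaluate_per_dataset`, and the two toy glucose traces are all assumptions made up for illustration. The only point is that errors are reported per dataset rather than pooled.

```python
import math
from typing import Dict, List


def persistence_rmse(series: List[float], horizon: int) -> float:
    """RMSE of a last-value forecast made `horizon` steps ahead."""
    if horizon < 1 or len(series) <= horizon:
        raise ValueError("series must be longer than the horizon")
    # The forecast for time t + horizon is simply the value observed at time t.
    errors = [series[t + horizon] - series[t] for t in range(len(series) - horizon)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def evaluate_per_dataset(datasets: Dict[str, List[float]], horizon: int) -> Dict[str, float]:
    """Apply the same forecaster to every dataset and keep the scores separate."""
    return {name: persistence_rmse(values, horizon) for name, values in datasets.items()}


# Hypothetical glucose traces in mg/dL; these values are invented, not drawn from Glucose-ML.
toy_datasets = {
    "stable_toy": [100.0, 102.0, 101.0, 103.0, 102.0, 104.0],
    "variable_toy": [100.0, 140.0, 90.0, 160.0, 80.0, 170.0],
}

scores = evaluate_per_dataset(toy_datasets, horizon=1)
for name, rmse in scores.items():
    print(f"{name}: RMSE = {rmse:.2f} mg/dL")
```

Keeping one score per dataset makes the spread visible; a single pooled number would hide the fact that the identical forecaster does far worse on the more variable trace.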
Abstract
Artificial intelligence (AI) algorithms are a critical part of state-of-the-art digital health technology for diabetes management. Yet, limited access to large, high-quality datasets creates barriers that impede development of robust AI solutions. To accelerate development of transparent, reproducible, and robust AI solutions, we present Glucose-ML, a collection of 10 publicly available diabetes datasets released within the last 7 years (i.e., 2018–2025). The Glucose-ML collection comprises over 300,000 days of continuous glucose monitor (CGM) data with a total of 38 million glucose samples collected from 2500+ people across 4 countries. Participants include persons living with type 1 diabetes, type 2 diabetes, prediabetes, and no diabetes. To support researchers and innovators with using this rich collection of diabetes datasets, we present a comparative analysis to guide algorithm…
Taxonomy
Topics: Diabetes Management and Research · Hyperglycemia and glycemic control in critically ill and hospitalized patients · Artificial Intelligence in Healthcare
