Deploying machine learning models in clinical settings: a real-world feasibility analysis for a model identifying adult-onset type 1 diabetes initially classified as type 2
Irene Brusini, Suyin Lee, Jacob Hollingsworth, Amanda Sees, Matthew Hackenberg, Harm Scherpbier, Raquel López-Díez, Nadejda Leavitt

TL;DR
This study tests how well a machine learning model can identify adult-onset type 1 diabetes misclassified as type 2 in real-world health data and evaluates the challenges of deploying such models in clinical settings.
Contribution
To the authors' knowledge, the first real-world evaluation of a machine learning model for identifying misclassified type 1 diabetes using health information exchange (HIE) data.
Findings
The national model performed well on HIE data (AUROC = 0.751; precision at 5% recall [PR5] = 25.5%), and localization (retraining on the regional dataset) improved performance further (AUROC = 0.774; PR5 = 35.4%).
Adjustments for HIE data compatibility revealed discrepancies in model predictors and highlighted the importance of aligning algorithm design with deployment needs.
Data inconsistencies across HIE member organizations could undermine model accuracy and provider trust in ML tools.
Abstract
This study evaluates the performance and deployment feasibility of a machine learning (ML) model to identify adult-onset type 1 diabetes (T1D) initially coded as type 2 in electronic medical records (EMRs) from a health information exchange (HIE). To our knowledge, this is the first evaluation of such a model on real-world HIE data. An existing ML model, trained on national US EMR data, was tested on a regional HIE dataset after several adjustments for compatibility. A localized model retrained on the regional dataset was compared to the national model. Discrepancies between the 2 datasets’ features and cohorts were also investigated. The national model performed well on HIE data (AUROC = 0.751; precision at 5% recall [PR5] = 25.5%), and localization further improved performance (AUROC = 0.774; PR5 = 35.4%). Differences in the 2 models’ top predictors reflected the discrepancies…
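The two metrics reported above, AUROC and precision at a fixed recall level, can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function names `auroc` and `precision_at_recall` are hypothetical, and the threshold rule (take the smallest top-ranked group whose recall reaches the target) is an assumption about how PR5 might be computed.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision_at_recall(
    labels: Sequence[int],
    scores: Sequence[float],
    target_recall: float = 0.05,
) -> float:
    """Precision among the top-ranked cases, cut off at the first rank
    where recall reaches target_recall.

    Assumption: tied scores are ordered arbitrarily by the sort, which a
    production evaluation would need to handle explicitly.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    true_pos = 0
    for k, (_, y) in enumerate(ranked, start=1):
        true_pos += y
        if true_pos / total_pos >= target_recall:
            return true_pos / k
    return true_pos / len(ranked)


# Toy data only; these are not values from the study.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
toy_pr50 = precision_at_recall(toy_labels, toy_scores, target_recall=0.5)
```

For a rare condition such as misclassified T1D, precision at low recall describes how many of the highest-scored patients are true cases. That figure may be closer to how a screening list would be used in practice than AUROC alone.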
Taxonomy
Topics: Diabetes Management and Research · Diabetes and associated disorders · Artificial Intelligence in Healthcare
