# Characterizing patients who benefit from mature medical AI models in real-world clinical applications

**Authors:** Zhiyi Chen, Wei Li, Zhicheng Lin

PMC · DOI: 10.1371/journal.pdig.0001283 · 2026-03-20

## TL;DR

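Mature medical AI models are deployed almost exclusively in high- and upper-middle-income countries, and they outperform human practitioners only in populations similar to those in which they were developed. The advantage disappears in out-of-distribution and underrepresented populations, highlighting a digital divide in access and effectiveness.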

## Contribution

A systematic review of 171 real-world deployment studies (209,772 patients) that quantifies who has access to mature medical AI models and in which populations their performance advantage over human practitioners holds.

## Key findings

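- AI models outperformed human practitioners in in-distribution settings (82.9% vs. 77.3% accuracy, P < .001) but not in out-of-distribution deployments (74.1% vs. 76.3%, P = .45).
- 95.1% of patient cohorts (studies) were from high-income (62.2%) or upper-middle-income (32.9%) countries, primarily China (28.7%) and the United States (18.9%); none were from low-income countries.
- Racial representation was dominated by White (49.1%) and Asian (42.6%) patients, and in underrepresented populations AI performance was not significantly different from that of human practitioners.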

## Abstract

Medical artificial intelligence (AI) is being rapidly deployed in clinical practice, yet its real-world effectiveness across diverse patient populations remains poorly characterized. We conducted a systematic review combining automated screening (fine-tuned BERT-PubMed classifiers) with manual validation to identify studies of mature medical AI models deployed in healthcare facilities worldwide. We included 171 studies at the “device-into-practice” stage with sufficient demographic and performance data, representing 209,772 patients. Patient access to these models showed marked demographic disparities: geographic concentration was extreme (Dagum–Gini coefficient 0.97, P < .001), with 95.1% of patient cohorts (studies) from high-income (62.2%) or upper-middle-income (32.9%) countries—primarily China (28.7%) and the United States (18.9%)—and no studies from low-income countries. Racial representation was dominated by White (49.1%) and Asian (42.6%) patients, and 63.8% of studies exhibited moderate-to-high sex imbalance. Across all studies, AI models outperformed human practitioners (81.7% vs. 77.8% accuracy, P < .001), but this superiority was confined to in-distribution applications (same geographic/demographic context: 82.9% vs. 77.3%, P < .001) and disappeared in out-of-distribution deployments (cross-geographic/demographic contexts: 74.1% vs. 76.3%, P = .45). In underrepresented populations, AI performance was not significantly different from that of human practitioners. Overall, mature medical AI models are deployed predominantly in economically advantaged settings, with performance advantages concentrated in well-represented demographic groups, highlighting a digital divide in access and effectiveness, and the need for demographic-specific validation.

Artificial intelligence (AI) is rapidly transforming healthcare, but it remains unclear who actually benefits from these technologies in everyday clinical practice. We systematically reviewed 171 studies involving over 200,000 real-world patients and found that access to mature medical AI models is heavily skewed: 95.1% of patient cohorts were from high- or upper-middle-income countries, primarily China and the United States, with no representation from low-income countries. Most evaluations focused on White (49.1%) and Asian (42.6%) patients, and nearly two-thirds of studies showed substantial sex imbalances. Across studies, AI models generally outperformed human practitioners on diagnostic and prognostic tasks, but this advantage was seen only when systems were used in populations similar to those in which they were developed. When deployed in different regions or demographic groups, AI performance decreased and was no better than that of human clinicians, while typically remaining comparable rather than clearly worse. These patterns challenge the idea that medical AI systems are universally applicable and reveal a digital divide in who benefits from them. Our findings highlight the need for more inclusive AI development, routine monitoring of performance after deployment in diverse populations, and, where appropriate, region-specific adaptation of models so that AI can support patients more equitably across different settings.
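The abstract reports geographic concentration as a Dagum–Gini coefficient of 0.97. As a rough illustration of what a value that close to 1 means, the sketch below computes the ordinary Gini coefficient for a list of per-country study counts. It is not the authors' Dagum decomposition, the function name `gini` is hypothetical, and the counts are invented for illustration rather than taken from the paper.

```python
def gini(values):
    """Ordinary Gini coefficient of non-negative values (0 = equal, near 1 = concentrated).

    Illustrative only: the paper reports a Dagum-Gini coefficient, whose exact
    computation is not described in this summary.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Invented example: four countries, all studies in a single country.
concentrated = [0, 0, 0, 1]
# Invented example: four countries with the same number of studies each.
even = [5, 5, 5, 5]

print(gini(concentrated))  # 0.75, the maximum (n - 1) / n for n = 4
print(gini(even))          # 0.0
```

With many countries and nearly all studies in a handful of them, the maximum `(n - 1) / n` approaches 1, which is the regime the reported 0.97 points to.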

## Full-text entities

- **Diseases:** cardiovascular diseases (MESH:D002318), psychiatric (MESH:D001523), breast and gastrointestinal cancers (MESH:D001943), fatigue (MESH:D005221), sepsis (MESH:D018805), cancer (MESH:D009369), autoimmune conditions (MESH:D001327)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13004356/full.md

---
Source: https://tomesphere.com/paper/PMC13004356