# Large Language Models Using Clinical Text in Pediatrics: A Scoping Review

**Authors:** Tracy Huang, Gabriel Tse, Natalie M. Pageler, Yair Bannett

PMC · DOI: 10.1001/jamanetworkopen.2026.2443 · 2026-03-25

## TL;DR

This scoping review of 40 studies maps how large language models are being used to analyze pediatric clinical text. Clinical decision support was the most common application, and transparent reporting, standardized evaluation, and data privacy safeguards were rarely in place.

## Contribution

The study maps emerging research on LLMs in pediatrics and identifies critical evidence gaps in implementation and evaluation.

## Key findings

- Diagnostic decision support (24 of 40 studies, 60.0%) and treatment planning (7 studies, 17.5%) were the most common applications.
- Only 1 study (2.5%) fully met MINIMAR reporting standards, and 30 studies (75.0%) lacked fine-tuning for pediatric-specific data.
- Early childhood populations (aged 0-5 years) were underrepresented in the analyzed studies.

## Abstract

How are large language models (LLMs) being used to analyze clinical text in pediatrics?

This scoping review of 40 studies, all published within the past 2 years, found diverse clinical applications of LLMs, most commonly for clinical decision support, across multiple pediatric subspecialties. However, there was limited use of transparent reporting, standardized evaluation methods, and ethical or data privacy safeguards.

This study’s results suggest that it is imperative to prioritize pediatric-specific data and adherence to rigorous reporting and evaluation standards to ensure safe and effective implementation of LLMs for analyzing clinical text in pediatrics.

This scoping review examines research on large language model use in pediatric clinical text analysis.

**Importance:** Large language models (LLMs) are increasingly being applied to analyze clinical data, primarily clinical text, with an increasing emphasis on integration in health care. However, the use of LLMs in pediatric care remains underexplored.

**Objective:** To map the emerging literature on LLM use in pediatrics involving clinical text and identify evidence gaps and future directions for implementation and evaluation.

**Evidence Review:** PubMed/MEDLINE, Embase, Web of Science, Scopus, and preprint servers were searched for English-language original research published from January 1, 2020, to July 1, 2025. Included studies used modern transformer-based LLMs with pediatric clinical text as input. Two reviewers independently screened studies using predefined criteria. Data were extracted by one reviewer and verified by another. Findings were descriptively synthesized, and adherence to the Minimum Information for Medical AI Reporting (MINIMAR) standards was assessed.

**Findings:** The review included 40 studies published between 2023 and 2025. Twenty-three studies were conducted in the US, and all were retrospective observational studies using clinical data from sources such as electronic health records. Participant sample sizes ranged from 10 to 172,683. Although all pediatric age subgroups were represented, early childhood populations (aged 0-5 years) were underrepresented. The most common LLM clinical applications were diagnostic decision support in 24 studies (60.0%) and treatment planning in 7 studies (17.5%). Although all 40 studies conducted clinical evaluation of LLMs and 30 included discussions of ethics or data privacy, 39 studies (97.5%) did not meet full MINIMAR standards, 34 (85.0%) did not report use of Health Insurance Portability and Accountability Act–compliant models, and 30 (75.0%) lacked fine-tuning for pediatric-specific data. Among 33 studies assessing model performance against human annotations, 10 (30.3%) did not include clinicians as annotators; among 26 studies with multiple annotators, only 9 (34.6%) reported interannotator agreement statistics.
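As a quick consistency check, the percentages quoted in these findings can be recomputed from the stated counts and denominators. The sketch below uses only figures from the paragraph above; the short labels, the `reported` list, and the `pct` helper are illustrative names, not part of the review.

```python
# Recompute the reported percentages from the counts given in the abstract.
# Each entry: (label, count, denominator, percentage as stated in the text).
reported = [
    ("diagnostic decision support", 24, 40, 60.0),
    ("treatment planning", 7, 40, 17.5),
    ("did not meet full MINIMAR standards", 39, 40, 97.5),
    ("no HIPAA-compliant model reported", 34, 40, 85.0),
    ("no pediatric-specific fine-tuning", 30, 40, 75.0),
    ("no clinician annotators", 10, 33, 30.3),
    ("interannotator agreement reported", 9, 26, 34.6),
]


def pct(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


for label, count, total, stated in reported:
    computed = pct(count, total)
    status = "ok" if computed == stated else "mismatch"
    print(f"{label}: {count}/{total} = {computed}% (stated {stated}%) {status}")
```

Note that the last two rows use subgroup denominators (33 and 26 studies), not the full set of 40.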

**Conclusions and Relevance:** This scoping review found that diagnostic decision support and treatment planning were commonly proposed applications of LLMs in pediatrics. However, gaps in scientific rigor and limited use of pediatric-specific data may hinder their safe and effective implementation in pediatrics. Future studies should use standardized evaluation and reporting methods, increase clinician involvement, and expand research to underrepresented ages and clinical applications.
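To make the interannotator agreement recommendation concrete, here is a minimal sketch of one widely used agreement statistic, Cohen's kappa, for two annotators. The review does not say which statistics the included studies reported, so the choice of kappa is an assumption, and the labels below are invented purely for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set.")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations (not from any study in the review).
annotator_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
annotator_b = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg"]
kappa = cohen_kappa(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this toy example the annotators agree on 6 of 8 items (0.75) against a chance agreement of 0.5, which is why kappa is lower than raw percent agreement.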

## Full-text entities

- **Diseases:** infectious diseases (MESH:D003141), allergy (MESH:D004342), refractive error (MESH:D012030), cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13019234/full.md

---
Source: https://tomesphere.com/paper/PMC13019234