# SpiroLLM: Finetuning pretrained LLMs to understand spirogram time series with clinical validation in COPD reporting

**Authors:** Shuhao Mei, Yongchao Long, Xiaoyu Xiao, Shan Cao, Xiaobo Han, Shijia Geng, Jinbo Sun, Yuxi Zhou, Shenda Hong

PMC · DOI: 10.1371/journal.pdig.0001300 · PLOS Digital Health · 2026-03-24

## TL;DR

SpiroLLM is a multimodal LLM that reads spirogram curves together with pulmonary function test (PFT) values and generates detailed COPD diagnostic reports, reaching a diagnostic AUROC of 0.8977 on UK Biobank data while explaining its reasoning.

## Contribution

SpiroLLM is presented as the first multimodal LLM that can understand spirogram time series and generate interpretable clinical reports.

## Key findings

- SpiroLLM achieved a diagnostic AUROC of 0.8977 with a 95% confidence interval of 0.88-0.91.
- In a robustness test with missing core data, the model maintained a 100% valid response rate, compared with 13.4% for a text-only model.
- The system generates comprehensive reports by fusing spirogram morphology with PFT numerical values.
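
The AUROC and its 95% confidence interval reported above can be illustrated with a short, generic sketch. The summary does not say how the interval was computed, so the percentile bootstrap below is an assumption, and the function names `auroc` and `bootstrap_auroc_ci` are hypothetical. The labels and scores in the usage lines are toy values, not data from the paper.

```python
import random

def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation, with tied scores
    receiving their average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap interval (assumed method)."""
    point = auroc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[k] for k in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lo, hi

# Toy usage: four cases, two with COPD (label 1) and two without (label 0).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
```

With a cohort the size of the one described here, the interval from such a procedure would be far narrower than anything a toy sample produces; the sketch only shows the mechanics.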

## Abstract

Chronic Obstructive Pulmonary Disease (COPD), a major chronic respiratory disease with persistent airflow limitation, is a leading global cause of disability and mortality. Respiratory spirogram time series, routinely collected during pulmonary function tests (PFTs), play a critical role in the early detection of respiratory diseases and in monitoring lung function over time. However, most current AI models for COPD diagnosis only output classification results without providing a rationale for their diagnostic process, while current Large Language Models (LLMs) cannot yet understand spirograms; both shortcomings severely limit clinical trust and adoption. To tackle this challenge, we use a cohort of 234,028 individuals from the UK Biobank (UKB) to develop SpiroLLM, the first multimodal large language model that can understand spirograms. The model extracts morphological features from respiratory curves via a SpiroEncoder and aligns them with PFT numerical values in a unified latent space using a SpiroProjector, ultimately enabling a large language model to generate a comprehensive diagnostic report. Experimental results confirm that SpiroLLM achieved a diagnostic AUROC of 0.8977 (95% CI: 0.88-0.91). In a robustness test with missing core data, it maintained a 100% valid response rate, far surpassing the 13.4% of a text-only model and showcasing the superiority of its multimodal design. This work demonstrates the substantial potential of deeply fusing physiological signals with large language models, establishing a new paradigm for the next generation of interpretable and reliable clinical decision support tools.
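
The encoder-to-projector data flow described in the abstract can be pictured with a rough sketch. Everything concrete here is an assumption made for illustration: the patch size, the feature widths, the random stand-in weights, the choice to pass PFT values as prompt text, and the helper names `encode_spirogram`, `project`, and `build_prompt`. The actual SpiroEncoder and SpiroProjector architectures are not specified in this summary, and the PFT numbers in the usage lines are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

PATCH = 25    # hypothetical: samples per patch of the curve
D_FEAT = 32   # hypothetical encoder feature width
D_LLM = 64    # hypothetical LLM embedding width

# Random stand-ins for weights that would normally be learned.
W_enc = rng.normal(scale=0.1, size=(PATCH, D_FEAT))
W_proj = rng.normal(scale=0.1, size=(D_FEAT, D_LLM))
b_proj = np.zeros(D_LLM)

def encode_spirogram(curve):
    """Stand-in encoder: split the curve into fixed patches and embed each one."""
    curve = np.asarray(curve, dtype=float)
    n_patches = len(curve) // PATCH
    patches = curve[: n_patches * PATCH].reshape(n_patches, PATCH)
    return np.tanh(patches @ W_enc)        # shape (n_patches, D_FEAT)

def project(features):
    """Stand-in projector: linear map into the LLM embedding space."""
    return features @ W_proj + b_proj      # shape (n_patches, D_LLM)

def build_prompt(pft):
    """Serialise PFT values as text; missing values are simply left out."""
    parts = [f"{k}={v:.2f}" for k, v in pft.items() if v is not None]
    return "PFT values: " + (", ".join(parts) if parts else "not available")

# Toy volume-time curve: 300 samples over 6 seconds (made-up shape).
t = np.linspace(0.0, 6.0, 300)
curve = 4.0 * (1.0 - np.exp(-t / 0.8))

signal_tokens = project(encode_spirogram(curve))
prompt = build_prompt({"FEV1": 2.1, "FVC": 3.9, "FEV1/FVC": None})
```

In this sketch the curve still yields a full set of signal embeddings even when a PFT value is missing from the prompt, which is one plausible reading of why a multimodal design would stay usable when a text-only model has too little input to work with.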

Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of death and disability worldwide, yet diagnosing it remains a significant challenge. Accurate diagnosis typically requires highly trained specialists to interpret complex charts from breathing tests—a process that is time-consuming and often unavailable in areas with limited medical resources. While artificial intelligence has shown promise in medicine, most existing tools only provide simple “yes or no” classifications without explaining the reasoning behind the diagnosis, which limits their usefulness and trustworthiness for doctors. In this study, we developed a new system called SpiroLLM to bridge this gap. Unlike traditional tools, our approach uses advanced technology to “look” at the shape of a patient’s breathing curves and combine this visual information with standard test numbers. The system then automatically writes a detailed, easy-to-understand clinical report, much like a human expert would. We tested our model using a large dataset from the UK Biobank and found it to be highly accurate and reliable, even when some patient data was missing. This work represents a step forward in digital health, offering a transparent assistant for clinicians that could make expert-level COPD diagnosis more accessible and consistent globally.

## Linked entities

- **Diseases:** Chronic Obstructive Pulmonary Disease (MONDO:0005002), COPD (MONDO:0005002)

## Full-text entities

- **Diseases:** death (MESH:D003643), COPD (MESH:D029424), CLF (OMIM:604595), LLM (MESH:D007806), airway obstruction (MESH:D000402), respiratory disease (MESH:D012140)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13012452/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13012452/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC13012452/full.md

---
Source: https://tomesphere.com/paper/PMC13012452