# A Feasibility Study of Literature-Guided HRV Stratification Using Large Language Models

**Authors:** Tien-Yu Hsu, Gau-Jun Tang, Cheng-Han Wu, Jen-Tin Lee, Terry B. J. Kuo

PMC · DOI: 10.3390/diagnostics16040540 · 2026-02-11

## TL;DR

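This feasibility study uses a large language model (LLM) to extract heart rate variability (HRV) parameters from 140 medical abstracts and to group patient data with literature-derived thresholds, aiming for clinical decision support whose reasoning is transparent and easy to update as new studies appear.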

## Contribution

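The paper introduces an LLM-assisted framework for transparent, literature-guided HRV-based risk stratification. The LLM supports evidence-guided parameter selection rather than serving as a predictive classifier.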

## Key findings

- In a feasibility evaluation on ECG-derived HRV features, the framework reached 86% accuracy in literature-guided HRV classification.
- Compared with traditional machine learning approaches, it provided transparent, literature-grounded reasoning and could be readily updated as new studies became available.
- Sensitivity was 81% and specificity was 87% in the same evaluation.
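
The accuracy, sensitivity, and specificity figures above are all derived from the same confusion matrix. As a minimal sketch of the standard definitions (the function name `binary_metrics` is hypothetical and the counts are made-up placeholders, not data from the study):

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Standard confusion-matrix metrics for a two-class problem."""
    return {
        # Share of truly positive cases that were flagged as positive.
        "sensitivity": tp / (tp + fn),
        # Share of truly negative cases that were left unflagged.
        "specificity": tn / (tn + fp),
        # Share of all cases that were classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Placeholder counts for illustration only; NOT the study's data.
metrics = binary_metrics(tp=8, fn=2, tn=18, fp=2)
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so it always lies between the two. The reported 86% sitting between 81% and 87% is consistent with that relationship.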

## Abstract

**Background:** Heart rate variability (HRV) is a valuable indicator for assessing vascular health, but keeping clinical decision support systems (CDSSs) aligned with the rapidly evolving literature remains challenging. This study aimed to develop an LLM-assisted literature synthesis framework to support transparent HRV-based risk stratification, enabling systematic extraction and organization of HRV evidence from published studies.

**Methods:** An LLM-driven framework was developed to extract HRV parameters from 140 medical abstracts. The system simulated step-by-step human reasoning to identify key HRV indicators and group patient data using predefined statistical thresholds derived from the literature. System performance was evaluated using ECG-derived HRV features as a feasibility evaluation of literature-guided HRV classification.

**Results:** The proposed framework demonstrated an accuracy of 86% in literature-guided HRV classification, with a sensitivity of 81% and a specificity of 87%. Compared with traditional machine learning approaches, the LLM-assisted system provided transparent, literature-grounded reasoning and could be readily updated as new studies became available.

**Conclusions:** Large language models can support evidence-guided parameter selection and feasibility-level HRV-based risk stratification, rather than serving as predictive classifiers. This approach reduces manual effort, enhances transparency, and addresses common “black box” concerns associated with AI-assisted CDSS development in clinical practice.
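
The Methods describe grouping patient data with predefined, literature-derived thresholds on HRV indicators. The sketch below illustrates that general idea only and is not the authors' implementation. It computes two standard time-domain HRV features (SDNN and RMSSD) from RR intervals and applies rules that each carry a source label, so every decision can be traced to a citation. The `Rule` class, the `stratify` function, the cut-off values, and the source strings are all hypothetical placeholders and are not taken from the paper.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import stdev


@dataclass(frozen=True)
class Rule:
    parameter: str    # HRV feature name, e.g. "sdnn_ms"
    threshold: float  # cut-off value (placeholder, NOT from the paper)
    source: str       # citation the cut-off would be traced back to


def sdnn(rr_ms: list[float]) -> float:
    """Sample standard deviation of the RR (NN) intervals, in ms."""
    return stdev(rr_ms)


def rmssd(rr_ms: list[float]) -> float:
    """Root mean square of successive RR-interval differences, in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))


def stratify(features: dict[str, float], rules: list[Rule]) -> tuple[str, list[Rule]]:
    """Flag a recording when any feature falls below its cut-off.

    Returns the label plus the rules that fired, so the reasoning
    behind the label stays visible.
    """
    triggered = [r for r in rules if features[r.parameter] < r.threshold]
    label = "higher risk" if triggered else "lower risk"
    return label, triggered


# Hypothetical RR intervals (ms) and hypothetical rules, for illustration only.
rr = [800.0, 810.0, 790.0, 805.0, 795.0]
features = {"sdnn_ms": sdnn(rr), "rmssd_ms": rmssd(rr)}
rules = [
    Rule("sdnn_ms", 50.0, "hypothetical study A"),
    Rule("rmssd_ms", 10.0, "hypothetical study B"),
]
label, fired = stratify(features, rules)
```

Keeping the source string next to each cut-off is one way to make such a system updatable: under this design, adding a new study would mean adding or editing a `Rule`, not retraining a model.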

## Full-text entities

- **Diseases:** heart failure (MESH:D006333), ventricular (MESH:D014693), LF (MESH:C565121), hypertension (MESH:D006973), death (MESH:D003643), Cerebrovascular Event (MESH:D002561), Cardiovascular Event (MESH:D002318), MLRD (MESH:D000069279), stroke (MESH:D020521), arrhythmia (MESH:D001145), HF (MESH:D006316), vascular dysregulation (MESH:D021081), arrhythmic (OMIM:212500), PTSD (MESH:D013313), injury to (MESH:D014947), psychiatric and neurological diseases (MESH:D001523)
- **Chemicals:** HF (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12939377/full.md

---
Source: https://tomesphere.com/paper/PMC12939377