LLM-Augmented Symptom Analysis for Cardiovascular Disease Risk Prediction: A Clinical NLP
Haowei Yang, Ziyu Shen, Junli Shao, Luyao Men, Xinyue Han, Jing Dong

TL;DR
This paper presents a novel NLP pipeline using domain-adapted large language models to extract symptoms from clinical notes, improving cardiovascular risk prediction accuracy and clinical relevance.
Contribution
It introduces an LLM-augmented clinical NLP approach with cardiovascular-specific fine-tuning and prompt-based reasoning, addressing challenges like hallucination and temporal ambiguity.
Findings
Improved precision, recall, F1-score, and AUROC on the MIMIC-III and CARDIO-NLP datasets.
High clinical relevance as assessed by cardiologists (kappa = 0.82).
Addresses challenges in clinical LLM use, including contextual hallucination and temporal ambiguity.
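For readers unfamiliar with the kappa statistic quoted above, the sketch below shows how an agreement score of this kind can be computed. It assumes Cohen's kappa for two raters, which the summary does not confirm; the function name `cohen_kappa` and the ratings are hypothetical and are not data from the paper.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (assumed variant)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings (1 = clinically relevant, 0 = not relevant).
cardiologist_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
cardiologist_2 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
kappa = cohen_kappa(cardiologist_1, cardiologist_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so it is lower than the simple fraction of matching labels whenever the raters' label frequencies make chance matches likely.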
Abstract
Timely identification and accurate risk stratification of cardiovascular disease (CVD) remain essential for reducing global mortality. While existing prediction models primarily leverage structured data, unstructured clinical notes contain valuable early indicators. This study introduces a novel LLM-augmented clinical NLP pipeline that employs domain-adapted large language models for symptom extraction, contextual reasoning, and correlation from free-text reports. Our approach integrates cardiovascular-specific fine-tuning, prompt-based inference, and entity-aware reasoning. Evaluations on the MIMIC-III and CARDIO-NLP datasets demonstrate improved precision, recall, F1-score, and AUROC, with high clinical relevance (kappa = 0.82) as assessed by cardiologists. Challenges such as contextual hallucination, which occurs when plausible information contradicts the provided source, and…
Taxonomy
Topics: Artificial Intelligence in Healthcare
