Can Large Language Models Reliably Extract Physiology Index Values from Coronary Angiography Reports?

Sofia Morgado; Filipa Valdeira; Niklas Sander; Diogo Ferreira; Marta Vilela; Miguel Menezes; Cláudia Soares

arXiv:2604.13077·cs.CL·April 16, 2026

TL;DR

This study evaluates how well local general-purpose and medical Large Language Models extract physiological measurements, along with their anatomical locations, from 1342 Portuguese coronary angiography reports, highlighting both the potential and the limitations of current models.

Contribution

To the authors' knowledge, the first large-scale investigation of LLMs for extracting physiology indexes from Portuguese CAG reports, comparing different prompting strategies and evaluation methods.

Findings

01

Llama with zero-shot prompting achieved the best results.

02

GPT-OSS showed high robustness to prompt variations.

03

Constrained generation slightly decreased performance but enabled template adherence.

Abstract

Coronary angiography (CAG) reports contain clinically relevant physiological measurements, yet this information is typically recorded as unstructured natural language, which limits its use in research. We investigate the use of Large Language Models (LLMs) to automatically extract these values, along with their anatomical locations, from Portuguese CAG reports. To our knowledge, this study is the first to address physiology index extraction from a large corpus of CAG reports (1342 reports), and one of the few focusing on CAG or Portuguese clinical text. We explore local, privacy-preserving general-purpose and medical LLMs under different settings. Prompting strategies included zero-shot, few-shot, and few-shot prompting with implausible examples. In addition, we apply constrained generation and introduce a post-processing step based on RegEx. Given the sparsity of measurements, we…
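The RegEx-based post-processing mentioned in the abstract is not detailed on this page. As a rough illustration only, a minimal sketch of such a step might look like the following. The index names (FFR, iFR), the pattern, the 0 to 1 plausibility range, and the function name `extract_indexes` are all assumptions made for this example, not the authors' implementation.

```python
import re

# Hypothetical index names: this page does not list which indexes were extracted.
CANONICAL = {"ffr": "FFR", "ifr": "iFR"}

# Assumed pattern: an index name, up to 20 non-digit characters (e.g. a vessel
# name or "="), then a value written with a decimal point or a decimal comma,
# as is common in Portuguese text.
INDEX_PATTERN = re.compile(
    r"\b(?P<index>FFR|iFR)\b\D{0,20}?(?P<value>[01][.,]\d{1,2})",
    re.IGNORECASE,
)


def extract_indexes(text):
    """Return (index, value) pairs found in text, keeping only values in [0, 1]."""
    results = []
    for match in INDEX_PATTERN.finditer(text):
        # Normalise the decimal comma before converting to float.
        value = float(match.group("value").replace(",", "."))
        # Assumed plausibility check: discard values outside the 0 to 1 range.
        if 0.0 <= value <= 1.0:
            results.append((CANONICAL[match.group("index").lower()], value))
    return results


if __name__ == "__main__":
    print(extract_indexes("FFR na DA: 0,78; iFR = 0.91"))
```

A step of this kind could be applied to raw model output to normalise number formats and filter implausible values before scoring; whether the paper's post-processing works this way is not stated here.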
