Linguistic Blind Spots in Clinical Decision Extraction

Mohamed Elgaar; Hadi Amiri

arXiv:2602.03942·cs.CL·February 5, 2026

TL;DR

This study analyzes the linguistic features of clinical decision spans in medical notes. It finds category-specific patterns and identifies narrative-style spans as a key obstacle to accurate extraction, with implications for improving clinical NLP systems.

Contribution

The paper provides a detailed linguistic analysis of the challenges in clinical decision extraction, highlights how narrative style lowers extraction performance, and suggests boundary-tolerant approaches.

Findings

01. Exact-match recall is 48%, dropping to 24% for spans with high stopword or hedging content.

02. Relaxed overlap recall increases to 71%, indicating boundary disagreements are common.

03. Narrative spans in advice and precautions are a major blind spot for current extraction methods.
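
The gap between exact-match and relaxed overlap recall can be illustrated with a minimal sketch. It assumes spans are half-open character offsets `(start, end)` and that "relaxed" means any character overlap with a predicted span; the paper's actual matching criterion may differ, category labels are ignored here, and the function names and toy offsets are hypothetical.

```python
from typing import List, Tuple

# Assumed representation: half-open [start, end) character offsets.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """True when two half-open spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def span_recall(gold: List[Span], pred: List[Span], relaxed: bool = False) -> float:
    """Fraction of gold spans recovered by the predictions.

    relaxed=False: a gold span counts only if an identical span was predicted.
    relaxed=True:  a gold span counts if any prediction overlaps it
                   (an assumed definition of "relaxed overlap").
    """
    if not gold:
        return 0.0
    pred_set = set(pred)
    hits = 0
    for g in gold:
        if relaxed:
            hits += any(overlaps(g, p) for p in pred)
        else:
            hits += g in pred_set
    return hits / len(gold)


# Toy offsets, invented for illustration (not from MedDec).
gold = [(0, 10), (20, 35), (50, 80), (90, 100)]
pred = [(0, 10), (22, 35), (55, 70)]

exact = span_recall(gold, pred)                  # only (0, 10) matches exactly
relaxed = span_recall(gold, pred, relaxed=True)  # boundary mismatches now count
print(f"exact={exact:.2f} relaxed={relaxed:.2f}")
```

In this toy case two of the three recovered spans differ from the gold annotation only at their boundaries, which is the kind of disagreement that relaxed matching tolerates and exact matching penalizes.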

Abstract

Extracting medical decisions from clinical notes is a key step for clinical decision support and patient-facing care summaries. We study how the linguistic characteristics of clinical decisions vary across decision categories and whether these differences explain extraction failures. Using MedDec discharge summaries annotated with decision categories from the Decision Identification and Classification Taxonomy for Use in Medicine (DICTUM), we compute seven linguistic indices for each decision span and analyze span-level extraction recall of a standard transformer model. We find clear category-specific signatures: drug-related and problem-defining decisions are entity-dense and telegraphic, whereas advice and precaution decisions contain more narrative, with higher stopword and pronoun proportions and more frequent hedging and negation cues. On the validation split, exact-match recall is…
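
As a rough illustration of span-level linguistic indices of the kind the abstract mentions (stopword and pronoun proportions, hedging and negation cues), the sketch below computes four of them with tiny hand-written lexicons. The lexicons, the tokenizer, the function name, and the two example sentences are all assumptions made for illustration; they are not the paper's seven indices or its resources.

```python
import re
from typing import Dict

# Tiny illustrative lexicons (assumptions, not the paper's resources).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "if", "you", "your",
             "is", "or", "in", "may", "be", "not", "for", "with"}
PRONOUNS = {"you", "your", "she", "he", "her", "his", "they", "their", "it", "we"}
HEDGES = {"may", "might", "could", "possibly", "likely", "consider"}
NEGATIONS = {"no", "not", "without", "never", "denies"}


def span_indices(text: str) -> Dict[str, float]:
    """Compute simple lexical indices for one decision span."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    n = len(tokens) or 1  # avoid division by zero on empty spans
    return {
        "stopword_prop": sum(t in STOPWORDS for t in tokens) / n,
        "pronoun_prop": sum(t in PRONOUNS for t in tokens) / n,
        "hedge_count": float(sum(t in HEDGES for t in tokens)),
        "negation_count": float(sum(t in NEGATIONS for t in tokens)),
    }


# Invented examples: a telegraphic drug-style span and a narrative advice-style span.
telegraphic = span_indices("Start metoprolol 25 mg BID")
narrative = span_indices("If you feel dizzy, you may not drive")

print("telegraphic:", telegraphic)
print("narrative:  ", narrative)
```

Even with such crude lexicons, the narrative example scores higher on stopword and pronoun proportions and carries hedging and negation cues, while the telegraphic example scores zero on all four.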

Taxonomy

Topics: Machine Learning in Healthcare · Electronic Health Records Systems · Topic Modeling