Adverse Event Extraction from Discharge Summaries: A New Dataset, Annotation Scheme, and Initial Findings

Imane Guellil; Salomé Andres; Atul Anand; Bruce Guthrie; Huayu Zhang; Abul Hasan; Honghan Wu; Beatrice Alex

arXiv:2506.14900·cs.CL·June 19, 2025

TL;DR

This paper introduces a new annotated dataset for extracting adverse events from discharge summaries of elderly patients, highlighting challenges in detecting complex, underrepresented clinical entities with transformer models.

Contribution

It provides a novel, richly annotated corpus supporting complex entity structures and evaluates multiple models, establishing a benchmark for adverse event extraction in clinical NLP.

Findings

01. Transformer-based models perform well on document-level coarse-grained extraction (F1 = 0.943).

02. Performance drops notably on fine-grained entity-level detection (F1 = 0.675).

03. Rare events and nuanced clinical language remain difficult to identify.
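One way to see why document-level and entity-level scores can diverge is sketched below. It assumes exact-match scoring on entity offsets and a document-level view that only asks whether a label occurs in a document; the paper's actual evaluation protocol is not given in this summary, and the annotations, offsets, and label names here are made up for illustration.

```python
from typing import Set, Tuple

# Hypothetical annotation format: (doc_id, start, end, label).
EntityKey = Tuple[str, int, int, str]


def f1(gold: set, pred: set) -> float:
    """F1 over two sets of items, counting exact matches as true positives."""
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


def to_document_level(entities: Set[EntityKey]) -> Set[Tuple[str, str]]:
    """Collapse entity mentions to (doc_id, label) pairs, discarding offsets."""
    return {(doc_id, label) for doc_id, _, _, label in entities}


# Illustrative data only: one predicted boundary is off by one character.
gold = {("d1", 10, 14, "Fall"), ("d1", 40, 48, "Delirium"), ("d2", 5, 9, "Fall")}
pred = {("d1", 10, 14, "Fall"), ("d1", 41, 48, "Delirium"), ("d2", 5, 9, "Fall")}

entity_f1 = f1(gold, pred)
document_f1 = f1(to_document_level(gold), to_document_level(pred))
print(f"entity-level F1:   {entity_f1:.3f}")
print(f"document-level F1: {document_f1:.3f}")
```

In this toy case a single boundary error lowers the entity-level score while leaving the document-level score untouched, which is one plausible contributor to the kind of gap reported here.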

Abstract

In this work, we present a manually annotated corpus for Adverse Event (AE) extraction from discharge summaries of elderly patients, a population often underrepresented in clinical NLP resources. The dataset includes 14 clinically significant AEs, such as falls, delirium, and intracranial haemorrhage, along with contextual attributes like negation, diagnosis type, and in-hospital occurrence. Uniquely, the annotation schema supports both discontinuous and overlapping entities, addressing challenges rarely tackled in prior work. We evaluate multiple models using FlairNLP across three annotation granularities: fine-grained, coarse-grained, and coarse-grained with negation. While transformer-based models (e.g., BERT-cased) achieve strong performance on document-level coarse-grained extraction (F1 = 0.943), performance drops notably for fine-grained entity-level tasks (e.g., F1 = 0.675),…
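As a rough illustration of what "discontinuous and overlapping entities" means in practice, the sketch below stores each entity as a list of character-offset fragments plus a negation flag. This is not the paper's annotation format: the `Entity` class, the `find_span` and `overlaps` helpers, the label names, and the example sentence are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # character offsets, end exclusive


@dataclass(frozen=True)
class Entity:
    label: str
    fragments: Tuple[Span, ...]  # more than one fragment means discontinuous
    negated: bool = False

    def text(self, doc: str) -> str:
        """Join the fragment texts in order, separated by a space."""
        return " ".join(doc[start:end] for start, end in self.fragments)

    def is_discontinuous(self) -> bool:
        return len(self.fragments) > 1


def find_span(doc: str, phrase: str) -> Span:
    """Offsets of the first occurrence of phrase (raises ValueError if absent)."""
    start = doc.index(phrase)
    return (start, start + len(phrase))


def overlaps(a: Entity, b: Entity) -> bool:
    """True if any fragment of a shares at least one character with a fragment of b."""
    return any(s1 < e2 and s2 < e1
               for s1, e1 in a.fragments
               for s2, e2 in b.fragments)


# Made-up sentence, not taken from the dataset.
doc = "No delirium. Subdural and intracranial haemorrhage after a fall."

subdural = Entity("Haemorrhage",
                  (find_span(doc, "Subdural"), find_span(doc, "haemorrhage")))
intracranial = Entity("Haemorrhage",
                      (find_span(doc, "intracranial haemorrhage"),))
delirium = Entity("Delirium", (find_span(doc, "delirium"),), negated=True)
fall = Entity("Fall", (find_span(doc, "fall"),))

print(subdural.text(doc), subdural.is_discontinuous())
print(overlaps(subdural, intracranial))
```

Here the two haemorrhage mentions share the token "haemorrhage", so a flat one-label-per-token tagging scheme cannot encode both without extra machinery, which is the kind of difficulty the schema is described as addressing.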

Taxonomy

Topics: Topic Modeling · Digital and Cyber Forensics · Software Engineering Research