What's in a Summary? Laying the Groundwork for Advances in Hospital-Course Summarization
Griffin Adams, Emily Alsentzer, Mert Ketenci, Jason Zucker, Noémie Elhadad

TL;DR
This paper introduces hospital-course summarization, a new task in clinical NLP, builds a large dataset for it, and analyzes the unique challenges of summarizing hospital-stay documentation.
Contribution
The authors present a novel dataset of 109,000 hospitalizations for training and evaluating hospital-course summarization models, and analyze the characteristics of clinician-authored summaries.
Findings
Brief Hospital Course (BHC) paragraphs are highly abstractive but contain some long extracted fragments.
Summaries are concise yet comprehensive, and differ in style and content organization from the source notes.
BHC paragraphs exhibit minimal lexical cohesion and should be treated as silver-standard references.
Abstract
Summarization of clinical narratives is a long-standing research problem. Here, we introduce the task of hospital-course summarization. Given the documentation authored throughout a patient's hospitalization, generate a paragraph that tells the story of the patient admission. We construct an English, text-to-text dataset of 109,000 hospitalizations (2M source notes) and their corresponding summary proxy: the clinician-authored "Brief Hospital Course" paragraph written as part of a discharge note. Exploratory analyses reveal that the BHC paragraphs are highly abstractive with some long extracted fragments; are concise yet comprehensive; differ in style and content organization from the source notes; exhibit minimal lexical cohesion; and represent silver-standard references. Our analysis identifies multiple implications for modeling this complex, multi-document summarization task.
