Extractive Summarization of EHR Discharge Notes
Emily Alsentzer, Anne Kim

TL;DR
This paper establishes an upper bound on extractive summarization of discharge notes and develops an LSTM model that sequentially labels the topics of history of present illness notes, a step toward automated summarization that could aid clinical decision making.
Contribution
It contributes an LSTM-based approach to sequential topic labeling of history of present illness notes, intended as a way to create a dataset for evaluating extractive summarization methods.
Findings
Achieved an F1 score of 0.876 for sequential topic labeling of history of present illness notes.
Provided an upper bound for extractive summarization performance.
Indicated that the topic-labeling model can be used to create a dataset for evaluating extractive summarization methods.
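The summary here does not say how the upper bound is computed. As a rough illustration only, one generic way to bound an extractive summarizer is to measure how much of the reference summary's vocabulary appears anywhere in the source notes, since an extractive method cannot produce words the source lacks. The sketch below assumes simple lowercase token matching; the function names and example texts are hypothetical and are not taken from the paper.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extractive_recall_ceiling(source_notes, summary):
    """Fraction of distinct summary tokens that occur somewhere in the source notes.

    An extractive method can only copy from the source, so its token recall
    against this summary cannot exceed this value.
    """
    source_vocab = set()
    for note in source_notes:
        source_vocab.update(tokens(note))
    summary_tokens = set(tokens(summary))
    if not summary_tokens:
        return 0.0
    return len(summary_tokens & source_vocab) / len(summary_tokens)


# Hypothetical texts for illustration; not real patient data.
notes = [
    "Patient admitted with chest pain.",
    "Started aspirin; troponin negative.",
]
summary = "Chest pain, troponin negative, discharged home."
ceiling = extractive_recall_ceiling(notes, summary)
```

In this toy case four of the six distinct summary tokens occur in the notes, so no purely extractive method could recover more than two thirds of the summary's vocabulary.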
Abstract
Patient summarization is essential for clinicians to provide coordinated care and practice effective communication. Automated summarization has the potential to save time, standardize notes, aid clinical decision making, and reduce medical errors. Here we provide an upper bound on extractive summarization of discharge notes and develop an LSTM model to sequentially label topics of history of present illness notes. We achieve an F1 score of 0.876, which indicates that this model can be employed to create a dataset for evaluation of extractive summarization methods.
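For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed for a sequence of predicted topic labels against gold labels. The abstract does not state whether the reported 0.876 is micro- or macro-averaged, so the macro average used here is an assumption, and the label names are hypothetical rather than the paper's label set.

```python
from collections import Counter


def per_label_f1(gold, pred):
    """Return {label: F1} for two equal-length label sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed the gold label g
    scores = {}
    for label in set(gold) | set(pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 (averaging choice is an assumption)."""
    scores = per_label_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical topic labels, one per sentence or token.
gold = ["symptoms", "symptoms", "history", "procedure", "history"]
pred = ["symptoms", "history", "history", "procedure", "history"]
score = macro_f1(gold, pred)
```

A micro-averaged score would instead pool true positives, false positives, and false negatives across labels before computing F1, which can differ noticeably when label frequencies are imbalanced.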
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
