Exploring Optimal Granularity for Extractive Summarization of Unstructured Health Records: Analysis of the Largest Multi-Institutional Archive of Health Records in Japan

Kenichiro Ando; Takashi Okumura; Mamoru Komachi; Hiromasa Horiguchi; Yuji Matsumoto

arXiv:2209.10041·cs.CL·December 21, 2022

TL;DR

This study investigates the optimal level of detail for extractive summarization of unstructured health records, finding that clinical segments provide the best accuracy among tested granularities, aiding automated discharge summary generation.

Contribution

It introduces a method to define and automatically split clinical segments, and compares their effectiveness with sentences and clauses for extractive summarization in medical texts.

Findings

01

Clinical segments outperform sentences and clauses in summarization accuracy.

02

Machine learning-based splitting of clinical segments achieved a high F1 score of 0.846.

03

Granularity finer than whole sentences (clinical segments) improves extractive summarization of health records.
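
The F1 score reported for segment splitting is the harmonic mean of precision and recall. A minimal sketch of how such a score could be computed, assuming predicted and gold segment boundaries are compared as exact positions; the evaluation protocol actually used in the paper is not described on this page, and the function name and offsets below are hypothetical:

```python
def boundary_f1(gold, predicted):
    """Return (precision, recall, F1) over two collections of boundary positions.

    Assumption: a predicted boundary counts as correct only if it matches
    a gold boundary position exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical character offsets at which a clinical segment ends.
gold_boundaries = [12, 30, 47, 63]
predicted_boundaries = [12, 30, 50, 63, 70]

precision, recall, f1 = boundary_f1(gold_boundaries, predicted_boundaries)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case three of the five predicted boundaries match the four gold boundaries, so precision and recall differ and F1 falls between them.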

Abstract

Automated summarization of clinical texts can reduce the burden on medical professionals. "Discharge summaries" are one promising application of summarization, because they can be generated from daily inpatient records. Our preliminary experiment suggests that 20-31% of the descriptions in discharge summaries overlap with the content of the inpatient records. However, it remains unclear how the summaries should be generated from the unstructured source. To decompose the physician's summarization process, this study aimed to identify the optimal granularity in summarization. We first defined three types of summarization units with different granularities to compare the performance of the discharge summary generation: whole sentences, clinical segments, and clauses. We defined clinical segments in this study, aiming to express the smallest medically meaningful concepts. To obtain the…