Leveraging Pretrained Models for Automatic Summarization of Doctor-Patient Conversations

Longxiang Zhang; Renato Negrinho; Arindam Ghosh; Vasudevan Jagannathan; Hamid Reza Hassanzadeh; Thomas Schaaf; Matthew R. Gormley

arXiv:2109.12174 · cs.CL · September 28, 2021 · 1 citation

Open Access · 1 repo

TL;DR

This paper shows that fine-tuning a pretrained transformer model, BART, on a specially constructed dataset can produce fluent and adequate summaries of doctor-patient conversations despite limited training data and long, noisy transcripts. The resulting models surpass previously published work and the performance of an average human annotator.

Contribution

It introduces a multistage approach for summarizing long conversations by chunking and rewriting, improving summary quality and handling domain-specific challenges.
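
The contrast between the truncation baseline and a chunk-then-rewrite pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`truncate`, `chunk_turns`, `multistage_summarize`) are hypothetical, length is counted in whitespace tokens rather than BART's byte-pair tokens, and the two model stages are passed in as plain callables standing in for fine-tuned summarizers.

```python
from typing import Callable, List


def truncate(tokens: List[str], max_len: int) -> List[str]:
    """Baseline: keep only the first max_len tokens and drop the rest."""
    return tokens[:max_len]


def chunk_turns(turns: List[str], max_len: int) -> List[List[str]]:
    """Greedily pack whole speaker turns into chunks of at most max_len tokens.

    Assumption: tokens are whitespace-separated words. A single turn longer
    than max_len is kept whole in its own chunk rather than split.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for turn in turns:
        n = len(turn.split())
        # Start a new chunk when adding this turn would exceed the budget.
        if current and used + n > max_len:
            chunks.append(current)
            current, used = [], 0
        current.append(turn)
        used += n
    if current:
        chunks.append(current)
    return chunks


def multistage_summarize(
    turns: List[str],
    max_len: int,
    summarize_chunk: Callable[[str], str],  # hypothetical stage-1 model
    rewrite: Callable[[str], str],          # hypothetical stage-2 model
) -> str:
    """Summarize each chunk, then rewrite the concatenated partial summaries."""
    partial = [summarize_chunk(" ".join(c)) for c in chunk_turns(turns, max_len)]
    return rewrite(" ".join(partial))
```

Unlike truncation, every turn of the conversation reaches a model in this arrangement; the cost is one extra pass to merge the partial summaries.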

Findings

01 Fine-tuned BART outperforms previously published work and an average human annotator.

02 The multistage approach improves the handling of long conversations.

03 Automatic and human evaluations confirm high summary quality.

Abstract

Fine-tuning pretrained models for automatically summarizing doctor-patient conversation transcripts presents many challenges: limited training data, significant domain shift, long and noisy transcripts, and high target summary variability. In this paper, we explore the feasibility of using pretrained transformer models for automatically summarizing doctor-patient conversations directly from transcripts. We show that fluent and adequate summaries can be generated with limited training data by fine-tuning BART on a specially constructed dataset. The resulting models greatly surpass the performance of an average human annotator and the quality of previous published work for the task. We evaluate multiple methods for handling long conversations, comparing them to the obvious baseline of truncating the conversation to fit the pretrained model length limit. We introduce a multistage approach…

Code & Models

Repositories

negrinho/medical_conversation_summarization
Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dropout · Layer Normalization · Adam · Dense Connections · Byte Pair Encoding