Abstractive Summarization of Spoken and Written Instructions with BERT
Alexandra Savelieva, Bryan Au-Yeung, and Vasanth Ramani

TL;DR
This paper applies BERTSum to generate abstractive summaries of spoken and written instructions across diverse domains, reporting human-like fluency and utility and results above the prior state of the art on WikiHow.
Contribution
First application of BERTSum to conversational language, with transfer learning and preprocessing techniques to handle speech disfluencies and domain diversity.
Findings
Achieves human-like fluency and utility in summaries
Outperforms SOTA on WikiHow articles
No performance loss on CNN/DailyMail dataset
Abstract
Summarization of speech is a difficult problem due to the spontaneity of the flow, disfluencies, and other issues that are not usually encountered in written texts. Our work presents the first application of the BERTSum model to conversational language. We generate abstractive summaries of narrated instructional videos across a wide variety of topics, from gardening and cooking to software configuration and sports. To enrich the vocabulary, we use transfer learning and pretrain the model on a few large cross-domain datasets in both written and spoken English. We also preprocess transcripts to restore sentence segmentation and punctuation in the output of an ASR system. The results are evaluated with ROUGE and Content-F1 scoring for the How2 and WikiHow datasets. We engage human judges to score a set of summaries randomly selected from a dataset curated from HowTo100M…
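The abstract names ROUGE as one of its evaluation metrics. As a rough illustration of what that score measures, here is a minimal sketch of ROUGE-N as clipped n-gram overlap between a generated summary and a reference. It assumes lowercased whitespace tokenization and a single reference. The function name `rouge_n` and the example sentences are hypothetical, and this is not the scoring script used in the paper.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Simplified ROUGE-N: clipped n-gram overlap between two strings.

    Assumption: tokens are lowercased and split on whitespace. Standard
    ROUGE tooling may also apply stemming or other normalization.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical summary/reference pair, for illustration only.
    print(rouge_n("mix the flour and water", "mix flour and water", n=1))
```

Recall rewards covering the reference content and precision penalizes extra words, so the F1 value balances the two. Content-F1, the other metric named in the abstract, is a different measure and is not covered by this sketch.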
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
