An End-to-End Workflow using Topic Segmentation and Text Summarisation Methods for Improved Podcast Comprehension
Andrew Aquilina, Sean Diacono, Panagiotis Papapetrou, and Maria Movin

TL;DR
This paper investigates how combining topic segmentation and text summarisation can enhance podcast comprehension, demonstrating effective methods and models that produce concise, relevant summaries aligned closely with human titles.
Contribution
It introduces a novel workflow that combines two topic segmentation methods (TextTiling and TextSplit) with three summarisation models (T5, BART, and Pegasus) for improved podcast episode summarisation.
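The segment-then-title shape of the workflow can be illustrated with a minimal sketch. It is not the authors' implementation: the segmenter below is a simplified lexical-cohesion stand-in (it is neither TextTiling nor TextSplit), and the names `segment`, `title_segments`, `block`, and `threshold` are hypothetical. The summariser is passed in as a plain callable, so any model wrapper (for example one around T5, BART, or Pegasus) could be supplied.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def _bag(sentences: List[str]) -> Counter:
    # Bag-of-words over lowercase alphabetic tokens (no stop-word removal).
    return Counter(re.findall(r"[a-z']+", " ".join(sentences).lower()))


def _cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two word-count vectors.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment(
    sentences: List[str], block: int = 2, threshold: float = 0.1
) -> List[List[str]]:
    # Assumption: a boundary is placed after sentence i when the blocks of
    # sentences on either side of that gap share too little vocabulary.
    segments: List[List[str]] = []
    current: List[str] = []
    for i, sentence in enumerate(sentences):
        current.append(sentence)
        left = sentences[max(0, i - block + 1): i + 1]
        right = sentences[i + 1: i + 1 + block]
        if right and _cosine(_bag(left), _bag(right)) < threshold:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def title_segments(
    segments: List[List[str]], summarise: Callable[[str], str]
) -> List[str]:
    # Apply the supplied summariser to the joined text of each segment.
    return [summarise(" ".join(seg)) for seg in segments]
```

On real transcripts, stop-word filtering and a smoothed similarity curve would likely be needed before a fixed threshold behaves sensibly; the sketch only shows how segmentation output feeds the titling step.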
Findings
TextSplit achieved the best segmentation metrics.
T5 produced the most relevant summaries.
Summaries were only 8% less relevant than human titles.
Abstract
The consumption of podcast media has been increasing rapidly. Due to the lengthy nature of podcast episodes, users often carefully select which ones to listen to. Although episode descriptions aid users by providing a summary of the entire podcast, they do not provide a topic-by-topic breakdown. This study explores the combined application of topic segmentation and text summarisation methods to investigate how podcast episode comprehension can be improved. We sampled 10 episodes from Spotify's English-Language Podcast Dataset and employed TextTiling and TextSplit to segment them. Moreover, three text summarisation models, namely T5, BART, and Pegasus, were applied to provide a very short title for each segment. The segmentation part was evaluated using our annotated sample with metrics including WindowDiff. A survey was also rolled out to assess the quality of…
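For readers unfamiliar with WindowDiff, the following is a minimal sketch of the metric as it is commonly defined: a window of `k` gaps slides over the text, and a window counts as an error when the reference and the hypothesis contain a different number of boundaries inside it. The boundary encoding (one 0/1 flag per gap between adjacent sentences), the function names, and the default window rule (half the mean reference segment length) are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence


def default_window(reference: Sequence[int]) -> int:
    # Assumed convention: k is half the mean reference segment length.
    n_sentences = len(reference) + 1
    n_segments = sum(reference) + 1
    return max(1, round(n_sentences / n_segments / 2))


def window_diff(
    reference: Sequence[int], hypothesis: Sequence[int], k: int
) -> float:
    # reference / hypothesis: 1 where a boundary falls in that gap, else 0.
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same number of gaps")
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and the number of gaps")
    n_windows = len(reference) - k + 1
    errors = 0
    for start in range(n_windows):
        ref_count = sum(reference[start:start + k])
        hyp_count = sum(hypothesis[start:start + k])
        if ref_count != hyp_count:
            errors += 1  # boundary counts disagree inside this window
    return errors / n_windows


reference = [0, 0, 1, 0, 0, 0, 1, 0, 0]   # 10 sentences, 3 segments
hypothesis = [0, 0, 0, 1, 0, 0, 1, 0, 0]  # first boundary placed one gap late
print(window_diff(reference, hypothesis, default_window(reference)))  # 0.25
```

Lower values are better: 0 means every window agrees, and a near miss such as the shifted boundary above is penalised only in the windows it affects.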
Taxonomy
Topics: Radio, Podcasts, and Digital Media · FinTech, Crowdfunding, Digital Finance · Web Data Mining and Analysis
Methods: Multi-Head Attention · Attention Is All You Need · Adam · Attention Dropout · SentencePiece · Linear Layer · Softmax · Dense Connections · Dropout
