Detecting Extraneous Content in Podcasts
Sravana Reddy, Yongze Yu, Aasish Pappu, Aswin Sivaraman, Rezvaneh Rezapour, Rosie Jones

TL;DR
This paper introduces classifiers that use textual and listening-pattern features to detect extraneous content, such as advertisements, in podcast descriptions and audio transcripts. Applied to podcast summarization, they improve ROUGE scores and reduce the extraneous material in generated summaries.
Contribution
The study combines textual and listening-pattern features to detect extraneous podcast content, and shows that this detection improves downstream summarization as measured by ROUGE.
Findings
Enhanced ROUGE scores in podcast summarization
Effective detection of extraneous content
Reduction of irrelevant material in summaries
Abstract
Podcast episodes often contain material extraneous to the main content, such as advertisements, interleaved within the audio and the written descriptions. We present classifiers that leverage both textual and listening patterns in order to detect such content in podcast descriptions and audio transcripts. We demonstrate that our models are effective by evaluating them on the downstream task of podcast summarization and show that we can substantively improve ROUGE scores and reduce the extraneous content generated in the summaries.
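To make the evaluation idea concrete, here is a minimal sketch of how removing extraneous sentences can change a ROUGE score. It is not the authors' method. The `toy_is_extraneous` keyword rule is a hypothetical stand-in for a trained classifier, the cue phrases and example sentences are invented for illustration, and `rouge1_f1` is a simplified unigram-overlap ROUGE-1 without stemming or the tokenization used by standard ROUGE toolkits.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on lowercased, whitespace-split tokens."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # per-token minimum count
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def toy_is_extraneous(sentence: str) -> bool:
    """Hypothetical keyword rule standing in for a trained classifier."""
    cues = ("sponsored by", "promo code", "subscribe")
    return any(cue in sentence.lower() for cue in cues)


def drop_extraneous(sentences, is_extraneous):
    """Keep only the sentences the detector does not flag."""
    return [s for s in sentences if not is_extraneous(s)]


# Invented example data, for illustration only.
reference = "we discuss the history of jazz"
sentences = [
    "We discuss the history of jazz",
    "This episode is sponsored by Acme",
]

unfiltered = " ".join(sentences)
filtered = " ".join(drop_extraneous(sentences, toy_is_extraneous))

score_unfiltered = rouge1_f1(reference, unfiltered)
score_filtered = rouge1_f1(reference, filtered)
```

In this toy case the sponsor sentence adds tokens that do not appear in the reference, so it lowers precision, and dropping it raises the score. The paper's classifiers and evaluation setup are more involved than this sketch.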
