Detecting Extraneous Content in Podcasts

Sravana Reddy; Yongze Yu; Aasish Pappu; Aswin Sivaraman; Rezvaneh Rezapour; Rosie Jones

arXiv:2103.02585·cs.CL·June 15, 2021

TL;DR

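This paper introduces classifiers that use both textual features and listening patterns to identify extraneous content, such as advertisements, in podcast descriptions and audio transcripts. Applied to the downstream task of podcast summarization, the classifiers improve ROUGE scores and reduce the extraneous material that appears in generated summaries.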

Contribution

The study combines textual and listening-pattern features to detect extraneous podcast content, and shows that this detection improves downstream summarization as measured by ROUGE.

Findings

01. Enhanced ROUGE scores in podcast summarization

02. Effective detection of extraneous content

03. Reduction of irrelevant material in summaries

Abstract

Podcast episodes often contain material extraneous to the main content, such as advertisements, interleaved within the audio and the written descriptions. We present classifiers that leverage both textual and listening patterns in order to detect such content in podcast descriptions and audio transcripts. We demonstrate that our models are effective by evaluating them on the downstream task of podcast summarization and show that we can substantively improve ROUGE scores and reduce the extraneous content generated in the summaries.
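
To make the ROUGE claim concrete, here is a minimal sketch of a unigram-overlap score (ROUGE-1) showing why extraneous text in a summary can lower the score. It is a simplified illustration, not the paper's evaluation setup: the function name `rouge1`, the whitespace tokenization, and the toy strings are all assumptions made for this example, and standard ROUGE toolkits apply further processing such as stemming.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram overlap with lowercase whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy strings, not taken from the paper's data.
reference = "the hosts discuss the history of jazz"
with_ad = "the hosts discuss the history of jazz use code save for ten percent off"
without_ad = "the hosts discuss the history of jazz"

scores_with_ad = rouge1(with_ad, reference)
scores_without_ad = rouge1(without_ad, reference)
```

In this toy case the summary that carries the promotional text keeps full recall but loses precision, so its F1 is lower than that of the clean summary. This is one plausible route by which removing extraneous content could raise ROUGE.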
