Rhapsody: A Dataset for Highlight Detection in Podcasts

Younghan Park; Anuj Diwan; David Harwath; Eunsol Choi

arXiv:2505.19429 · cs.CL · September 9, 2025

TL;DR

This paper introduces Rhapsody, a large dataset for podcast highlight detection, and evaluates various models, revealing the difficulty of the task even for advanced language models and the benefits of fine-tuning with in-domain data.

Contribution

The paper presents Rhapsody, a novel dataset for segment-level podcast highlight detection, and provides a comprehensive evaluation of baseline models, emphasizing the challenges and potential of fine-tuning.

Findings

1. State-of-the-art language models struggle with highlight detection.

2. Fine-tuned models outperform zero-shot approaches.

3. Combining speech features and transcripts improves performance.

Abstract

Podcasts have become daily companions for half a billion users. Given the enormous amount of podcast content available, highlights provide a valuable signal that helps viewers get the gist of an episode and decide whether to listen to it in its entirety. However, identifying highlights automatically is challenging because the content is unstructured and long-form. We introduce Rhapsody, a dataset of 13K podcast episodes paired with segment-level highlight scores derived from YouTube's 'most replayed' feature. We frame podcast highlight detection as a segment-level binary classification task. We explore various baseline approaches, including zero-shot prompting of language models and lightweight fine-tuned language models using segment-level classification heads. Our experimental results indicate that even state-of-the-art language models like GPT-4o and…
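
The framing in the abstract, segment-level replay scores turned into a binary highlight label per segment, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `0.5` threshold, the function names `binarize_scores` and `precision_recall`, and the toy scores are hypothetical and are not taken from the paper or its repository, which may label and evaluate segments differently.

```python
from typing import List, Tuple


def binarize_scores(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Label a segment 1 (highlight) when its score exceeds the threshold.

    The threshold is a hypothetical choice for this sketch, not the
    labeling rule used by the dataset authors.
    """
    return [1 if s > threshold else 0 for s in scores]


def precision_recall(pred: List[int], gold: List[int]) -> Tuple[float, float]:
    """Compute precision and recall for the positive (highlight) class."""
    tp = sum(1 for p, g in zip(pred, gold) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gold) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gold) if p == 0 and g == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy episode with five segments and made-up replay scores in [0, 1].
replay_scores = [0.10, 0.80, 0.30, 0.95, 0.20]
gold = binarize_scores(replay_scores)

# Made-up predictions from some hypothetical classifier.
pred = [0, 1, 1, 0, 0]
precision, recall = precision_recall(pred, gold)
```

In this toy episode the second and fourth segments become the gold highlights, and the prediction recovers one of the two while adding one false positive.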

Code & Models

Repositories

younghanstark/rhapsody (Official)

Taxonomy

Topics: Radio, Podcasts, and Digital Media