Modeling Language Usage and Listener Engagement in Podcasts
Sravana Reddy, Marina Lazarova, Yongze Yu, and Rosie Jones

TL;DR
This study analyzes how linguistic features like vocabulary, emotion, and syntax in podcasts relate to listener engagement, using data-driven models to identify highly predictive stylistic factors.
Contribution
It introduces a comprehensive analysis linking linguistic style to engagement, validating some popular beliefs and offering new insights.
Findings
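Linguistic features drawn from creators' written descriptions and audio transcripts are highly predictive of engagement
Vocabulary diversity, distinctiveness, emotion, and syntax are among the factors that correlate with engagement
Models built on different textual representations confirm the predictive value of these features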
Abstract
While there is an abundance of popular writing targeted to podcast creators on how to speak in ways that engage their listeners, there has been little data-driven analysis of podcasts that relates linguistic style with listener engagement. In this paper, we investigate how various factors -- vocabulary diversity, distinctiveness, emotion, and syntax, among others -- correlate with engagement, based on analysis of the creators' written descriptions and transcripts of the audio. We build models with different textual representations, and show that the identified features are highly predictive of engagement. Our analysis tests popular wisdom about stylistic elements in high-engagement podcasts, corroborating some aspects, and adding new perspectives on others.
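As a rough illustration of what a "vocabulary diversity" feature can look like, the sketch below computes a type-token ratio (distinct words divided by total words) for a piece of text. This is an assumption chosen for illustration: the abstract does not say which diversity measure the authors use, and the function name `type_token_ratio`, the tokenization rule, and the sample sentences are all hypothetical.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct words / total words, or 0.0 for text with no words.

    Illustrative measure only; not necessarily the one used in the paper.
    """
    # Lowercase and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical sample texts, not taken from the paper's data.
repetitive = "the show is the show and the show goes on"
varied = "today we explore volcanoes, glaciers, and deep ocean trenches"

print(type_token_ratio(repetitive))  # 6 distinct words out of 10
print(type_token_ratio(varied))      # 9 distinct words out of 9
```

A plain type-token ratio falls as texts get longer, so comparing a short description with a full transcript would need a length-controlled variant, such as averaging the ratio over fixed-size windows.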
