The Relationship Between Surprisal and Prosodic Prominence in Conversation Reflects Intelligibility‐Oriented Pressures
Thomas Hikaru Clark, Moshe Poliak, Tamar Regev, A. J. Haskins, Caroline Robertson, Edward Gibson

TL;DR
This study examines how word unpredictability in conversation relates to prosodic features such as pitch and duration; the results suggest that speakers emphasize unpredictable words to aid listener comprehension.
Contribution
The study provides new evidence that unpredictability in speech correlates with prosodic prominence, supporting intelligibility-oriented language production.
Findings
Higher GPT-2 surprisal predicts longer word duration, higher maximum pitch, and wider pitch range in conversation.
Listener backchannels are associated with spikes in speaker word surprisal.
Context window size affects model fit differently for maximum pitch versus other variables.
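For readers unfamiliar with the measure, surprisal is the negative log probability of a word given its preceding context. The sketch below is illustrative only and is not the authors' pipeline: it assumes the conditional probabilities have already been obtained from a language model such as GPT-2, and the helper names (`surprisal_bits`, `word_surprisal`, `truncate_context`) are hypothetical. It shows how subword surprisals add up to a word-level value and how a context window limits what the model conditions on.

```python
import math
from typing import List, Sequence


def surprisal_bits(prob: float) -> float:
    """Surprisal in bits: -log2 P(token | context)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def word_surprisal(subword_probs: Sequence[float]) -> float:
    """Word-level surprisal as the sum of its subword surprisals.

    By the chain rule, the probability of a word split into several
    subword tokens is the product of the conditional token probabilities,
    so the log probabilities (and hence surprisals) add.
    """
    return sum(surprisal_bits(p) for p in subword_probs)


def truncate_context(tokens: Sequence[str], window: int) -> List[str]:
    """Keep only the last `window` tokens of the preceding context.

    Assumption: the context window is measured in tokens. A window of 0
    means the model sees no preceding context at all.
    """
    if window < 0:
        raise ValueError("window must be non-negative")
    return list(tokens[-window:]) if window > 0 else []


if __name__ == "__main__":
    # Hypothetical probabilities, not values taken from the paper.
    print(surprisal_bits(0.5))
    print(word_surprisal([0.5, 0.25]))
    print(truncate_context(["so", "I", "went", "to", "the"], 3))
```

A predictable word (probability near 1) has surprisal near 0 bits, while a rare continuation yields a large value, which is the quantity the findings relate to duration and pitch.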
Abstract
Conversation is a dynamic, multimodal activity involving the exchange of complex streams of information like words, prosody, gesture, eye contact, and backchannels. Understanding how these different channels interact in naturalistic scenarios is essential for understanding the mechanisms governing human communication. Past studies suggested that the duration of words is tied to their predictability in context, but it remains unclear whether this relationship is speaker‐oriented (e.g., retrieval or production‐based) or due to listener‐oriented, intelligibility‐based pressures (i.e., emphasizing unpredictable words to ease comprehension). This study aims to examine the relationship between predictability and additional acoustic variables, to test how much intelligibility‐oriented principles impact conversation. We use the GPT‐2 large language model to assess the relationship between…
Taxonomy
Topics: Phonetics and Phonology Research · Hearing Impairment and Communication · Multisensory perception and integration
