TL;DR
This study provides strong evidence that, both across and within languages, more surprising phones tend to be longer and less surprising phones are produced faster, consistent with a surprisal-duration trade-off in human language processing.
Contribution
It presents evidence for a surprisal--duration trade-off in a corpus of 600 languages, combining cross-linguistic and within-language analyses while controlling for several potential confounds.
Findings
On average, phones are produced faster in languages where they are less surprising.
Within languages, more surprising phones are longer on average in 319 of the 600 languages analysed.
Evidence supports a universal surprisal--duration trade-off.
Abstract
While there exist scores of natural languages, each with its unique features and idiosyncrasies, they all share a unifying theme: enabling human communication. We may thus reasonably predict that human cognition shapes how these languages evolve and are used. Assuming that the capacity to process information is roughly constant across human populations, we expect a surprisal--duration trade-off to arise both across and within languages. We analyse this trade-off using a corpus of 600 languages and, after controlling for several potential confounds, we find strong supporting evidence in both settings. Specifically, we find that, on average, phones are produced faster in languages where they are less surprising, and vice versa. Further, we confirm that more surprising phones are longer, on average, in 319 languages out of the 600. We thus conclude that there is strong evidence of a…
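To make the quantity in the abstract concrete, the following is a minimal sketch of how a phone's surprisal could be related to its duration. The text above does not say how surprisal is estimated, so this sketch assumes a simple unigram estimate, -log2 p(phone), and uses invented toy durations. The names `unigram_surprisal`, `pearson`, and `tokens` are hypothetical, and the sketch applies none of the confound controls the abstract mentions.

```python
import math
from collections import Counter


def unigram_surprisal(phones):
    """Return {phone: -log2 p(phone)}, with p estimated from relative frequency.

    Assumption: a context-free (unigram) estimate, chosen only for illustration.
    """
    counts = Counter(phones)
    total = sum(counts.values())
    return {p: -math.log2(c / total) for p, c in counts.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy (phone, duration in ms) tokens; the values are invented, not from the paper.
tokens = [
    ("a", 60), ("a", 55), ("a", 65), ("a", 60),
    ("t", 80), ("t", 75),
    ("k", 100),
    ("s", 110),
]

surprisal = unigram_surprisal([phone for phone, _ in tokens])

# Pair each token's surprisal with its duration and measure the association.
xs = [surprisal[phone] for phone, _ in tokens]
ys = [duration for _, duration in tokens]
r = pearson(xs, ys)

print(surprisal)
print(f"correlation between surprisal and duration: {r:.3f}")
```

A positive `r` on real data would point in the direction of the within-language finding, but a raw correlation like this is only a first look, not the controlled analysis the abstract describes.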
