Hearings and mishearings: decrypting the spoken word
Anita Mehta, Jean-Marc Luck

TL;DR
This paper introduces a linguistically inspired model of speech perception, intended to be universal across languages, that accounts for mishearings and predicts when word recognition is easy or difficult, including a dynamical transition in perception.
Contribution
It presents a phenomenological model built on a two-parameter form for the word length distribution and a simple representation of mishearings, and uses it to study the dynamics of word recognition and the anticipation threshold.
Findings
Recognition is easier for shorter words below a threshold.
Mishearings significantly impair word recognition, especially in clusters.
A dynamical transition occurs before the static threshold, indicating complex perception behavior.
Abstract
We propose a model of the speech perception of individual words in the presence of mishearings. This phenomenological approach is based on concepts used in linguistics, and provides a formalism that is universal across languages. We put forward an efficient two-parameter form for the word length distribution, and introduce a simple representation of mishearings, which we use in our subsequent modelling of word recognition. In a context-free scenario, word recognition often occurs via anticipation when, part-way into a word, we can correctly guess its full form. We give a quantitative estimate of this anticipation threshold when no mishearings occur, in terms of model parameters. As might be expected, the whole anticipation effect disappears when there are sufficiently many mishearings. Our global approach to the problem of speech perception is in the spirit of an optimisation problem.…
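As a rough illustration of anticipation in the context-free case without mishearings, the sketch below finds how many letters of a word must be heard before it is the only candidate left in a lexicon. This is not the authors' model: the function name `anticipation_point`, the use of letters as units, and the toy lexicon are all assumptions made purely for illustration.

```python
from typing import Iterable, Optional


def anticipation_point(word: str, lexicon: Iterable[str]) -> Optional[int]:
    """Return the smallest prefix length at which `word` is the only
    lexicon entry compatible with what has been heard so far.

    Returns None if the word never becomes unique (for example, when it
    is itself a prefix of a longer entry) or is not in the lexicon.
    Hypothetical helper, not taken from the paper.
    """
    entries = set(lexicon)  # ignore duplicate entries
    if word not in entries:
        return None
    for k in range(1, len(word) + 1):
        prefix = word[:k]
        # Candidates still compatible with the first k letters heard.
        candidates = {w for w in entries if w.startswith(prefix)}
        if candidates == {word}:
            return k
    return None


# Toy lexicon, assumed for illustration only.
TOY_LEXICON = ["cat", "catalog", "dog", "door", "dormant"]

if __name__ == "__main__":
    for w in TOY_LEXICON:
        print(w, anticipation_point(w, TOY_LEXICON))
```

In this toy setting, "catalog" can be guessed after four of its seven letters, while "cat" is never unique because it is a prefix of a longer entry. The paper's anticipation threshold is a statistical quantity expressed in terms of model parameters, which this deterministic sketch does not attempt to reproduce.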
