A study on native American English speech recognition by Indian listeners with varying word familiarity level
Abhayjeet Singh, Achuth Rao MV, Rakesh Vaideeswaran, Chiranjeevi Yarra, Prasanta Kumar Ghosh

TL;DR
This study evaluates Indian listeners' recognition of American English speech across sentence difficulty levels and compares human performance with various automatic speech recognition systems, highlighting the influence of word familiarity and speaker nativity.
Contribution
It introduces a comprehensive analysis of native American English speech recognition by Indian listeners, considering word familiarity and speaker nativity, and compares human and machine recognition performance.
Findings
Recognition difficulty increases from easy to hard sentences.
ASR3 with combined models performs best among tested systems.
Certain speaker nativities are more challenging for Indian listeners.
Abstract
In this study, listeners of varied Indian nativities are asked to listen to and recognize TIMIT utterances spoken by American speakers. We collect three kinds of responses from each listener as they recognize an utterance: 1. sentence difficulty ratings, 2. speaker difficulty ratings, and 3. a transcription of the utterance. From these transcriptions, the word error rate (WER) is calculated and used as a metric to evaluate the similarity between the recognized and the original sentences. The sentences selected in this study are categorized into three groups, Easy, Medium and Hard, based on the frequency of occurrence of the words in them. We observe that the sentence difficulty ratings, the speaker difficulty ratings and the WERs all increase from the easy to the hard category of sentences. We also compare the human speech recognition performance with that using three automatic speech recognition (ASR) under following three…
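The abstract uses WER to score each listener transcription against the original sentence. As a rough illustration, WER is commonly computed as the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference. The sketch below assumes lowercasing and whitespace tokenization, which may differ from the normalization the authors actually applied; the function name `word_error_rate` and the example sentences are hypothetical and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: text is lowercased and split on whitespace; punctuation
    handling is left out and would need to match the study's own rules.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only, not data from the study.
reference = "the quick brown fox jumps"
print(word_error_rate(reference, "the quick brown fox jumps"))  # exact match
print(word_error_rate(reference, "the quick round fox jumps"))  # one substitution
print(word_error_rate(reference, "the quick fox jumps"))        # one deletion
```

Because insertions count as errors, WER computed this way can exceed 1.0 when a transcription is much longer than the reference.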
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Natural Language Processing Techniques
Methods: Attention Model
