Prompting Whisper for Improved Verbatim Transcription and End-to-end Miscue Detection
Griffin Dietz Smith, Dianna Yee, Jennifer King Chen, Leah Findlater

TL;DR
This paper introduces a novel end-to-end speech recognition architecture that uses prompting with target reading text to enhance verbatim transcription accuracy and enable direct miscue detection, outperforming existing methods.
Contribution
The paper demonstrates that incorporating the target reading text through prompting improves verbatim transcription performance over fine-tuning, and shows that it is feasible to augment speech recognition tasks for end-to-end miscue detection.
Findings
Improved verbatim transcription performance with prompting.
Effective end-to-end miscue detection in reading speech.
Outperforms current state-of-the-art methods.
Abstract
Identifying mistakes (i.e., miscues) made while reading aloud is commonly approached post-hoc by comparing automatic speech recognition (ASR) transcriptions to the target reading text. However, post-hoc methods perform poorly when ASR inaccurately transcribes verbatim speech. To improve on current methods for reading error annotation, we propose a novel end-to-end architecture that incorporates the target reading text via prompting and is trained for both improved verbatim transcription and direct miscue detection. Our contributions include: first, demonstrating that incorporating reading text through prompting benefits verbatim transcription performance over fine-tuning, and second, showing that it is feasible to augment speech recognition tasks for end-to-end miscue detection. We conducted two case studies -- children's read-aloud and adult atypical speech -- and found that our…
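For context, the post-hoc baseline that the abstract contrasts against can be sketched as a word-level alignment between the target reading text and an ASR transcript. This is a minimal illustration of that baseline only, not the paper's end-to-end architecture. The function names `normalize` and `posthoc_miscues`, the miscue labels, and the use of Python's `difflib.SequenceMatcher` as the aligner are all assumptions made for this example.

```python
from difflib import SequenceMatcher

PUNCT = ".,!?;:\"'"

def normalize(text):
    """Lowercase and strip surrounding punctuation (hypothetical helper)."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def posthoc_miscues(target_text, asr_transcript):
    """Align an ASR transcript to the target text and label mismatches.

    Assumption: SequenceMatcher stands in for whatever alignment a real
    post-hoc pipeline would use; the labels are illustrative.
    """
    target = normalize(target_text)
    spoken = normalize(asr_transcript)
    matcher = SequenceMatcher(a=target, b=spoken, autojunk=False)
    labels = {
        "replace": "substitution",
        "delete": "omission",
        "insert": "insertion",
    }
    miscues = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # reader said what the text says
        miscues.append((labels[tag], target[i1:i2], spoken[j1:j2]))
    return miscues

if __name__ == "__main__":
    target = "The cat sat on the mat."
    # A verbatim transcript preserves the reader's errors.
    print(posthoc_miscues(target, "the cat sit on mat"))
    # A transcript that silently "corrects" the reader hides them.
    print(posthoc_miscues(target, "the cat sat on the mat"))
```

The second call shows the weakness the abstract points to: if the ASR system outputs the expected text rather than what was actually said, the comparison finds nothing to flag, so post-hoc detection is only as good as the verbatim transcription it relies on.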
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Language Development and Disorders
