Transcribing Against Time
Matthias Sperber, Graham Neubig, Jan Niehues, Satoshi Nakamura, Alex Waibel

TL;DR
This paper presents a dynamic, adaptive framework for correcting speech transcripts within a fixed time budget. It improves correction efficiency by modeling transcriber effort and updating that model as conditions change during transcription.
Contribution
It introduces an approach that trains a transcriber-adaptive cost model during the ongoing transcription process, which improves correction efficiency and captures dynamic factors affecting transcriber performance.
Findings
Efficiency improved by 15% on average
Up to 42% improvement for certain transcribers
Dynamic models capture fatigue and topic familiarity effects
Abstract
We investigate the problem of manually correcting errors from an automatic speech transcript in a cost-sensitive fashion. This is done by specifying a fixed time budget, and then automatically choosing location and size of segments for correction such that the number of corrected errors is maximized. The core components, as suggested by previous research [1], are a utility model that estimates the number of errors in a particular segment, and a cost model that estimates annotation effort for the segment. In this work we propose a dynamic updating framework that allows for the training of cost models during the ongoing transcription process. This removes the need for transcriber enrollment prior to the actual transcription, and improves correction efficiency by allowing highly transcriber-adaptive cost modeling. We first confirm and analyze the improvements afforded by this method in a…
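To make the interplay between the utility model, the cost model, and the time budget concrete, here is a minimal sketch. It is not the authors' method: it assumes a fixed list of candidate segments (the paper also chooses segment location and size), a greedy errors-per-second ranking, and a hypothetical one-parameter cost model that is re-estimated after each corrected segment. All names (`Segment`, `OnlineCostModel`, `select_segments`) and the initial rate are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A candidate segment of the automatic transcript (hypothetical structure)."""
    name: str
    expected_errors: float  # utility model estimate of errors in the segment
    length_sec: float       # audio duration of the segment


class OnlineCostModel:
    """Assumed cost model: predicted effort = rate * audio length.

    The rate (seconds of work per second of audio) is re-estimated from the
    transcriber's observed times, so no prior enrollment data is required.
    """

    def __init__(self, initial_rate: float = 5.0):
        self.rate = initial_rate  # assumed prior, used before any observation
        self.total_audio = 0.0
        self.total_time = 0.0

    def predict(self, seg: Segment) -> float:
        return self.rate * seg.length_sec

    def update(self, seg: Segment, observed_sec: float) -> None:
        # Accumulate observations and refit the single rate parameter.
        self.total_audio += seg.length_sec
        self.total_time += observed_sec
        if self.total_audio > 0:
            self.rate = self.total_time / self.total_audio


def select_segments(segments, cost_model, budget_sec):
    """Greedily pick segments with the best expected errors per predicted second."""
    ranked = sorted(
        segments,
        key=lambda s: s.expected_errors / max(cost_model.predict(s), 1e-9),
        reverse=True,
    )
    chosen, spent = [], 0.0
    for seg in ranked:
        cost = cost_model.predict(seg)
        if spent + cost <= budget_sec:
            chosen.append(seg)
            spent += cost
    return chosen
```

In a dynamic setting one would call `update` after each corrected segment and then re-run `select_segments` on the remaining segments with the remaining budget, so that later choices reflect the transcriber's actual speed.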