HTEC: Human Transcription Error Correction
Hanbo Sun, Jian Gao, Xiaomin Wu, Anjie Fang, Cheng Cao, Zheng Du

TL;DR
HTEC is a novel two-stage system for human transcription error correction that significantly improves transcription quality, surpasses human annotators, and effectively assists human transcribers as a co-pilot.
Contribution
This paper introduces HTEC, a new method combining error detection and sequence generation, with novel correction operations and phoneme-aware embeddings, outperforming existing approaches.
Findings
HTEC reduces WER by 2.2% to 4.5% compared to other methods.
HTEC improves transcription quality by 15.1% when used as a co-pilot.
HTEC surpasses human annotators in transcription accuracy.
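For readers less familiar with the metric behind these findings, WER is the word-level edit distance (substitutions, deletions, and insertions) between a reference and a hypothesis, divided by the number of reference words. The sketch below is a minimal illustration, assuming whitespace tokenization and no text normalization. The function name is hypothetical, and the paper's exact scoring setup is not described in this summary.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are whitespace-separated and compared verbatim
    (no casing or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.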
Abstract
High-quality human transcription is essential for training and improving Automatic Speech Recognition (ASR) models. A recent study~\cite{libricrowd} found that every 1% increase in transcription Word Error Rate (WER) raises the WER of ASR models trained on those transcriptions by approximately 2%. Transcription errors are inevitable, even for highly trained annotators. However, few studies have explored human transcription correction. Error correction methods developed for other problems, such as ASR error correction and grammatical error correction, do not perform well enough on this problem. Therefore, we propose HTEC for Human Transcription Error Correction. HTEC consists of two stages: Trans-Checker, an error detection model that predicts and masks erroneous words, and Trans-Filler, a sequence-to-sequence generative model that fills masked positions. We propose a holistic list of correction…
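The detect-then-fill control flow described in the abstract can be sketched as follows. This is only an illustration of the two-stage shape, not the authors' implementation: the detector and filler here are toy stand-ins (a vocabulary lookup and a closest-string match via `difflib`), and every name in the block (`MASK`, `VOCAB`, `toy_detect`, `toy_fill`, `correct`) is hypothetical. The actual Trans-Checker and Trans-Filler are learned models, and the paper's correction operations and phoneme-aware embeddings are not modeled here.

```python
import difflib
from typing import Callable, List, Tuple

MASK = "<mask>"  # hypothetical mask token

# Hypothetical toy vocabulary standing in for a learned model's knowledge.
VOCAB = sorted({"the", "meeting", "starts", "at", "noon"})


def toy_detect(words: List[str]) -> List[bool]:
    """Stage 1 stand-in: flag a word as erroneous if it is not in VOCAB."""
    return [w not in VOCAB for w in words]


def toy_fill(words: List[str], masked: List[str]) -> List[str]:
    """Stage 2 stand-in: replace each masked slot with the closest VOCAB word.

    If nothing is close enough, the original word is kept unchanged.
    """
    filled = []
    for original, token in zip(words, masked):
        if token != MASK:
            filled.append(token)
            continue
        candidates = difflib.get_close_matches(original, VOCAB, n=1)
        filled.append(candidates[0] if candidates else original)
    return filled


def correct(
    words: List[str],
    detect: Callable[[List[str]], List[bool]],
    fill: Callable[[List[str], List[str]], List[str]],
) -> Tuple[List[str], List[str]]:
    """Run detection, mask the flagged positions, then fill them."""
    flags = detect(words)
    masked = [MASK if flagged else w for w, flagged in zip(words, flags)]
    return masked, fill(words, masked)


if __name__ == "__main__":
    masked, corrected = correct("the meating starts at noon".split(), toy_detect, toy_fill)
    print(masked)
    print(corrected)
```

Separating detection from generation means only the flagged positions are rewritten, so words the detector accepts pass through untouched.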
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and Audio Processing
