FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition
Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiang-Yang Li, Ed Lin, Tie-Yan Liu

TL;DR
FastCorrect is a non-autoregressive (NAR) error correction model for automatic speech recognition (ASR) that speeds up inference by 6-9 times while still reducing WER by 8-14%, and it is more accurate than existing NAR models.
Contribution
The paper introduces FastCorrect, a novel NAR model based on edit alignment that effectively balances speed and accuracy in ASR error correction.
Findings
Speeds up inference by 6-9 times
Reduces WER by 8-14%
Outperforms existing NAR models in accuracy
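The WER figures above can be made concrete with a short sketch. It computes word error rate as word-level edit distance divided by reference length, plus a relative reduction between two WER values. The helper names (`word_edit_distance`, `wer`, `relative_reduction`) and the example sentences are illustrative assumptions, and reading the reported 8-14% as a relative reduction is also an assumption, since the summary does not say which it is.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word missing from hyp
                            curr[j - 1] + 1,      # extra word in hyp
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate of a hypothesis against a reference transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, corrected):
    """Fraction by which `corrected` improves on `baseline`."""
    return (baseline - corrected) / baseline


# Made-up sentences, not data from the paper.
reference = "the cat sat on the mat"
asr_output = "the cat sit on mat"    # one substitution, one missing word
corrected = "the cat sat on mat"     # the substitution has been fixed

before = wer(reference, asr_output)  # 2 errors over 6 reference words
after = wer(reference, corrected)    # 1 error over 6 reference words
print(before, after, relative_reduction(before, after))
```

In this toy case the correction halves the WER, which is far larger than the reported range; the point is only to show how the quantities relate.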
Abstract
Error correction techniques have been used to refine the output sentences from automatic speech recognition (ASR) models and achieve a lower word error rate (WER) than original ASR outputs. Previous works usually use a sequence-to-sequence model to correct an ASR output sentence autoregressively, which causes large latency and cannot be deployed in online ASR services. A straightforward solution to reduce latency, inspired by non-autoregressive (NAR) neural machine translation, is to use an NAR sequence generation model for ASR error correction, which, however, comes at the cost of significantly increased ASR error rate. In this paper, observing distinctive error patterns and correction operations (i.e., insertion, deletion, and substitution) in ASR, we propose FastCorrect, a novel NAR error correction model based on edit alignment. In training, FastCorrect aligns each source token from…
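To illustrate the three correction operations named in the abstract, the following is a minimal sketch of an edit-distance alignment between an ASR output (source) and a reference (target). It is not the paper's algorithm: the function names `edit_ops` and `target_counts`, the tie-breaking order in the backtrace, and the rule of attaching inserted target tokens to the preceding source token are all assumptions made for illustration.

```python
def edit_ops(source, target):
    """Return one minimal sequence of (op, source_token, target_token) edits."""
    n, m = len(source), len(target)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete source token
                             dist[i][j - 1] + 1,         # insert target token
                             dist[i - 1][j - 1] + cost)  # keep or substitute

    # Walk back from the bottom-right corner to recover one optimal path.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            ops.append(("keep", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("substitute", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", source[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, target[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def target_counts(source, target):
    """Number of target tokens assigned to each source token.

    Assumption: an inserted target token is attached to the preceding
    source token, or to the first source token if none precedes it.
    """
    counts = [0] * len(source)
    pending = 0  # insertions seen before any source token
    idx = -1
    for op, _, _ in edit_ops(source, target):
        if op == "insert":
            if idx >= 0:
                counts[idx] += 1
            else:
                pending += 1
        else:
            idx += 1
            if op != "delete":
                counts[idx] += 1
            counts[idx] += pending
            pending = 0
    return counts


print(target_counts(["a", "b", "c"], ["a", "c"]))  # [1, 0, 1]: "b" is deleted
print(target_counts(["a", "c"], ["a", "b", "c"]))  # [2, 1]: "b" attaches to "a"
```

A count of 0 marks a deletion, 1 a kept or substituted token, and more than 1 an insertion next to that token. When several alignments have the same edit distance, this sketch simply returns whichever one its backtrace order finds first.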
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling
