Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction
Kai Shen, Yichong Leng, Xu Tan, Siliang Tang, Yuan Zhang, Wenjie Liu, Edward Lin

TL;DR
This paper introduces a simple masking strategy for error correction models that improves training efficiency and correction accuracy by better utilizing correct tokens during learning.
Contribution
The proposed masking approach enhances error correction models by reducing trivial copying and leveraging correct tokens more effectively, applicable across various models and tasks.
Findings
Consistent improvement in correction accuracy across multiple datasets
Effective in both autoregressive and non-autoregressive models
Reduces trivial copying of correct tokens during training
Abstract
Text error correction aims to correct the errors in text sequences such as those typed by humans or generated by speech recognition models. Previous error correction methods usually take the source (incorrect) sentence as encoder input and generate the target (correct) sentence through the decoder. Since the error rate of the incorrect sentence is usually low (e.g., 10%), the correction model can only learn to correct on limited error tokens but trivially copy on most tokens (correct tokens), which harms the effective training of error correction. In this paper, we argue that the correct tokens should be better utilized to facilitate effective training and then propose a simple yet effective masking strategy to achieve this goal. Specifically, we randomly mask out a part of the correct tokens in the source sentence and let the model learn to not only correct the original error tokens…
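The masking step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that source and target are token lists of equal length aligned position by position, so that a source token counts as "correct" when it equals the target token at the same index. The function name `mask_correct_tokens`, the `[MASK]` symbol, and the `mask_prob` parameter are hypothetical names chosen for illustration; the abstract does not state the masking ratio or the mask symbol.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def mask_correct_tokens(source, target, mask_prob=0.15, rng=None):
    """Return a copy of `source` with some correct tokens replaced by MASK.

    Assumption: `source` and `target` are position-aligned token lists of
    equal length, so source[i] is "correct" when source[i] == target[i].
    Error tokens are always left untouched.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes position-aligned sequences")
    if rng is None:
        rng = random.Random()

    masked = []
    for src_tok, tgt_tok in zip(source, target):
        is_correct = src_tok == tgt_tok
        if is_correct and rng.random() < mask_prob:
            masked.append(MASK)      # hide a correct token from the encoder
        else:
            masked.append(src_tok)   # keep error tokens and unmasked tokens
    return masked


# Example: one error ("teh") in a six-token sentence.
source = "the cat sat on teh mat".split()
target = "the cat sat on the mat".split()
masked_source = mask_correct_tokens(source, target, mask_prob=0.5,
                                    rng=random.Random(0))
```

In this sketch the target sentence is left unchanged, so a model trained on `(masked_source, target)` pairs can no longer reproduce a masked position by copying its input. Real correction data often has insertions and deletions, in which case an explicit alignment step would be needed before deciding which source tokens are correct.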
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
