Chinese Spelling Correction as Rephrasing Language Model
Linfeng Liu, Hongqiu Wu, Hai Zhao

TL;DR
This paper introduces the Rephrasing Language Model (ReLM) for Chinese Spelling Correction (CSC). ReLM rephrases the entire sentence instead of tagging each character, which improves accuracy and transferability across tasks.
Contribution
The paper proposes a novel rephrasing-based training paradigm for CSC that outperforms existing methods and enhances transferability of language representations.
Findings
Achieves new state-of-the-art results on CSC benchmarks.
Outperforms previous methods significantly in both fine-tuned and zero-shot settings.
Learns transferable language representations when jointly trained with other tasks.
Abstract
This paper studies Chinese Spelling Correction (CSC), which aims to detect and correct the potential spelling errors in a given sentence. Current state-of-the-art methods regard CSC as a sequence tagging task and fine-tune BERT-based models on sentence pairs. However, we note a critical flaw in the process of tagging one character to another, that the correction is excessively conditioned on the error. This is opposite from human mindset, where individuals rephrase the complete sentence based on its semantics, rather than solely on the error patterns memorized before. Such a counter-intuitive learning process results in the bottleneck of generalizability and transferability of machine spelling correction. To address this, we propose Rephrasing Language Model (ReLM), where the model is trained to rephrase the entire sentence by infilling additional slots, instead of…
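The contrast the abstract draws between character tagging and rephrasing by infilling slots can be illustrated with a small sketch. The abstract is truncated here, so the exact slot layout below (source characters, a separator, then one mask slot per target character) is an assumption made for illustration. The helper names and the BERT-style `[CLS]`, `[SEP]`, and `[MASK]` tokens are likewise hypothetical and not taken from the authors' code.

```python
from typing import List, Optional, Tuple

# Assumed BERT-style special tokens; the actual vocabulary may differ.
CLS, SEP, MASK = "[CLS]", "[SEP]", "[MASK]"


def build_tagging_example(source: str, target: str) -> Tuple[List[str], List[str]]:
    """Sequence tagging: every input character is labeled with its corrected character."""
    if len(source) != len(target):
        raise ValueError("tagging assumes source and target have equal length")
    return list(source), list(target)


def build_rephrasing_example(
    source: str, target: str
) -> Tuple[List[str], List[Optional[str]]]:
    """Rephrasing by infilling (assumed layout).

    The source sentence is kept intact as context and followed by one mask
    slot per target character. Only the mask slots carry labels, so the model
    has to regenerate the whole sentence rather than map one character to another.
    """
    tokens = [CLS] + list(source) + [SEP] + [MASK] * len(target) + [SEP]
    # None marks positions that are not predicted (context and special tokens).
    labels: List[Optional[str]] = (
        [None] * (len(source) + 2) + list(target) + [None]
    )
    return tokens, labels


if __name__ == "__main__":
    # Illustrative pair: the misspelled character 平 should be 苹.
    src, tgt = "我喜欢吃平果", "我喜欢吃苹果"
    tokens, labels = build_rephrasing_example(src, tgt)
    for tok, lab in zip(tokens, labels):
        print(tok, lab)
```

In the tagging layout the label for each position sits directly on top of the (possibly wrong) input character, whereas in the infilling layout every predicted slot sees the full source sentence as context.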
Code & Models
- 🤗 Macropodus/macbert4mdcspell_v1 · model · 40k dl · ♡ 2
- 🤗 Macropodus/macbert4csc_v2 · model · 8 dl · ♡ 2
- 🤗 Macropodus/macbert4csc_v1 · model · 5 dl · ♡ 1
- 🤗 Macropodus/bert4csc_v1 · model · 4 dl · ♡ 1
- 🤗 Macropodus/relm_v1 · model · 42 dl · ♡ 1
- 🤗 Macropodus/macbert4mdcspell_v2 · model · 283 dl · ♡ 6
- 🤗 Macropodus/macbert4mdcspell_v3 · model · 310 dl · ♡ 1
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
