RSpell: Retrieval-augmented Framework for Domain Adaptive Chinese Spelling Check
Siqi Song, Qi Lv, Lei Geng, Ziqiang Cao, and Guohong Fu

TL;DR
RSpell is a retrieval-augmented framework for Chinese Spelling Check that dynamically incorporates domain-specific terms to improve error detection and correction across various fields, achieving state-of-the-art results.
Contribution
The paper introduces RSpell, a novel retrieval-augmented framework with adaptive control and iterative reasoning for domain-adaptive Chinese Spelling Check.
Findings
Achieves state-of-the-art performance in multiple domains.
Effective in zero-shot and fine-tuning scenarios.
Demonstrates robustness across law, medicine, and official document writing.
Abstract
Chinese Spelling Check (CSC) refers to the detection and correction of spelling errors in Chinese texts. In practical application scenarios, CSC models need to correct errors across different domains. In this paper, we propose a retrieval-augmented spelling check framework called RSpell, which searches for relevant domain terms and incorporates them into CSC models. Specifically, we employ pinyin fuzzy matching to search for terms, which are combined with the input and fed into the CSC model. Then, we introduce an adaptive process control mechanism to dynamically adjust the impact of external knowledge on the model. Additionally, we develop an iterative strategy for the RSpell framework to enhance reasoning capabilities. We conducted experiments on CSC datasets in three domains: law, medicine, and official document writing. The results demonstrate…
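The retrieve-then-correct flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything named in it is an assumption made for illustration: the toy `PINYIN` table, the `difflib`-based similarity, the `0.8` threshold, the `[SEP]` joining format, and `toy_corrector`, which stands in for a real CSC model. The adaptive process control mechanism is not modeled here.

```python
from difflib import SequenceMatcher

# Hypothetical toy character-to-pinyin table (tones omitted).
# A real system would use a full pinyin dictionary.
PINYIN = {
    "原": "yuan", "员": "yuan", "告": "gao", "被": "bei",
    "起": "qi", "诉": "su", "素": "su",
}


def to_pinyin(text):
    """Map each character to its pinyin; unknown characters map to themselves."""
    return [PINYIN.get(ch, ch) for ch in text]


def pinyin_similarity(a, b):
    """Similarity in [0, 1] between the pinyin syllable sequences of a and b."""
    return SequenceMatcher(None, to_pinyin(a), to_pinyin(b)).ratio()


def retrieve_terms(sentence, lexicon, threshold=0.8):
    """Return lexicon terms whose pinyin fuzzily matches some span of the sentence."""
    hits = []
    for term in lexicon:
        n = len(term)
        best = max(
            (pinyin_similarity(sentence[i:i + n], term)
             for i in range(len(sentence) - n + 1)),
            default=0.0,
        )
        if best >= threshold:
            hits.append(term)
    return hits


def build_input(sentence, terms, sep="[SEP]"):
    """Combine retrieved terms with the input (assumed format, not from the paper)."""
    return "、".join(terms) + sep + sentence if terms else sentence


def toy_corrector(sentence, terms):
    """Stand-in for a CSC model: fix the first span that is homophonous with a term."""
    for term in terms:
        n = len(term)
        for i in range(len(sentence) - n + 1):
            span = sentence[i:i + n]
            if span != term and to_pinyin(span) == to_pinyin(term):
                return sentence[:i] + term + sentence[i + n:]
    return sentence


def iterative_check(sentence, lexicon, correct_fn, max_rounds=3):
    """Retrieve and correct repeatedly until the sentence stops changing."""
    for _ in range(max_rounds):
        terms = retrieve_terms(sentence, lexicon)
        corrected = correct_fn(sentence, terms)
        if corrected == sentence:
            break
        sentence = corrected
    return sentence


legal_lexicon = ["原告", "起诉", "被告"]  # plaintiff, to sue, defendant
print(retrieve_terms("员告起素", legal_lexicon))
print(build_input("员告起素", ["原告", "起诉"]))
print(iterative_check("员告起素", legal_lexicon, toy_corrector))
```

Because the toy corrector fixes only one span per call, the loop needs two rounds to repair both errors in the example, which illustrates one reason an iterative strategy may help: each round re-runs retrieval on a cleaner sentence.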
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Second Language Acquisition and Learning
