RSpell: Retrieval-augmented Framework for Domain Adaptive Chinese Spelling Check

Siqi Song, Qi Lv, Lei Geng, Ziqiang Cao, and Guohong Fu

arXiv:2308.08176 · cs.CL · August 31, 2023 · 1 citation

TL;DR

RSpell is a retrieval-augmented framework for Chinese Spelling Check that dynamically incorporates domain-specific terms to improve error detection and correction across various fields, achieving state-of-the-art results.

Contribution

The paper introduces RSpell, a novel retrieval-augmented framework with adaptive control and iterative reasoning for domain-adaptive Chinese Spelling Check.

Findings

01 Achieves state-of-the-art performance in multiple domains.

02 Effective in zero-shot and fine-tuning scenarios.

03 Demonstrates robustness across law, medicine, and official documents.

Abstract

Chinese Spelling Check (CSC) is the task of detecting and correcting spelling errors in Chinese text. In practical applications, CSC models need to correct errors across different domains. In this paper, we propose a retrieval-augmented spelling check framework called RSpell, which searches for corresponding domain terms and incorporates them into CSC models. Specifically, we employ pinyin fuzzy matching to search for terms, which are combined with the input and fed into the CSC model. Then, we introduce an adaptive process control mechanism to dynamically adjust the impact of external knowledge on the model. Additionally, we develop an iterative strategy for the RSpell framework to enhance reasoning capabilities. We conducted experiments on CSC datasets in three domains: law, medicine, and official document writing. The results demonstrate…
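
The retrieval step described in the abstract (pinyin fuzzy matching, then combining the matched terms with the input) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper or its repository: the toy pinyin table `TOY_PINYIN`, the helper names `to_pinyin`, `retrieve_terms`, and `augment_input`, the similarity measure (`difflib.SequenceMatcher`), the 0.8 threshold, and the `[SEP]` joining format.

```python
from difflib import SequenceMatcher

# Hypothetical toy pinyin table (tones dropped); a real system would need a full lexicon.
TOY_PINYIN = {
    "被": "bei", "告": "gao", "高": "gao", "人": "ren",
    "法": "fa", "院": "yuan",
}

def to_pinyin(text):
    """Space-joined toy pinyin; characters missing from the table are kept as-is."""
    return " ".join(TOY_PINYIN.get(ch, ch) for ch in text)

def retrieve_terms(sentence, domain_terms, threshold=0.8):
    """Return terms whose pinyin is close to some same-length span of the sentence."""
    hits = []
    for term in domain_terms:
        term_py = to_pinyin(term)
        n = len(term)
        # Slide a window of the term's length over the sentence.
        for start in range(len(sentence) - n + 1):
            span_py = to_pinyin(sentence[start:start + n])
            if SequenceMatcher(None, span_py, term_py).ratio() >= threshold:
                hits.append(term)
                break  # one matching span is enough for this term
    return hits

def augment_input(sentence, terms, sep="[SEP]"):
    """Prepend retrieved terms to the sentence (assumed format, not from the paper)."""
    return sep.join(terms + [sentence]) if terms else sentence

sentence = "被高人"  # "被告人" (defendant) written with a homophone error
terms = retrieve_terms(sentence, ["被告人", "法院"])
print(augment_input(sentence, terms))
```

In this toy case the misspelled span and the legal term share the same pinyin, so the term is retrieved even though the characters differ, while the unrelated term is filtered out by the threshold. The adaptive control mechanism and the iterative strategy are not sketched because the abstract does not describe how they work.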

Code & Models

Repositories

47777777/rspell (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Second Language Acquisition and Learning