Mixture of Small and Large Models for Chinese Spelling Check

Ziheng Qiao; Houquan Zhou; Zhenghua Li

arXiv:2506.06887·cs.CL·June 10, 2025

TL;DR

This paper introduces a dynamic mixture method combining small models and large language models during decoding to improve Chinese Spelling Check accuracy, achieving state-of-the-art results without fine-tuning LLMs.

Contribution

It proposes a novel mixture approach that balances small models and LLMs during decoding, avoiding fine-tuning and enhancing correction performance.

Findings

01. Significant improvement in error correction accuracy.

02. State-of-the-art results across multiple datasets.

03. Elimination of fine-tuning LLMs reduces resource requirements.

Abstract

In the era of large language models (LLMs), the Chinese Spelling Check (CSC) task has seen various LLM methods developed, yet their performance remains unsatisfactory. In contrast, fine-tuned BERT-based models, relying on high-quality in-domain data, show excellent performance but suffer from edit pattern overfitting. This paper proposes a novel dynamic mixture approach that effectively combines the probability distributions of small models and LLMs during the beam search decoding phase, achieving a balanced enhancement of precise corrections from small models and the fluency of LLMs. This approach also eliminates the need for fine-tuning LLMs, saving significant time and resources, and facilitating domain adaptation. Comprehensive experiments demonstrate that our mixture approach significantly boosts error correction capabilities, achieving state-of-the-art results across multiple…
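The abstract does not spell out the mixture formula, so the following is only a minimal sketch of the general idea: at each beam search step, the next-character distributions of a small model and an LLM are interpolated with a weight that changes per step. The entropy-based weight, the function names (`mix_step`, `beam_search`, `small_probs`, `llm_probs`), and the toy probability tables are all assumptions made for illustration and are not taken from the paper.

```python
import math

def normalize(dist, tokens):
    """Spread a (non-empty) distribution over a shared token set and renormalize."""
    total = sum(dist.get(t, 0.0) for t in tokens)
    return {t: dist.get(t, 0.0) / total for t in tokens}

def entropy(dist):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def mix_step(p_small, p_llm):
    """Mix two next-character distributions for one decoding step.

    Assumed heuristic (not the paper's rule): trust the small model more
    when its distribution is peaked, i.e. when its entropy is low.
    """
    tokens = sorted(set(p_small) | set(p_llm))
    small = normalize(p_small, tokens)
    llm = normalize(p_llm, tokens)
    if len(tokens) < 2:
        weight = 1.0
    else:
        weight = 1.0 - entropy(small) / math.log(len(tokens))
        weight = max(0.0, min(1.0, weight))  # guard against float drift
    mixed = {t: weight * small[t] + (1.0 - weight) * llm[t] for t in tokens}
    return weight, mixed

def beam_search(length, small_probs, llm_probs, beam_size=2):
    """Beam search over mixed distributions; returns (text, log-prob) pairs."""
    beams = [("", 0.0)]
    for i in range(length):
        candidates = []
        for prefix, score in beams:
            _, mixed = mix_step(small_probs(prefix, i), llm_probs(prefix, i))
            for tok, p in mixed.items():
                if p > 0:  # skip impossible continuations
                    candidates.append((prefix + tok, score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams

# Toy, hand-written tables for the misspelled input "平果" (intended "苹果").
# Real models would condition on the prefix; these stand-ins ignore it.
SMALL = [{"平": 0.55, "苹": 0.45}, {"果": 0.99, "裹": 0.01}]
LLM = [{"平": 0.10, "苹": 0.90}, {"果": 0.95, "裹": 0.05}]

def small_probs(prefix, i):
    return SMALL[i]

def llm_probs(prefix, i):
    return LLM[i]

best_text, best_score = beam_search(2, small_probs, llm_probs)[0]
print(best_text, round(best_score, 4))
```

In this toy setup the small model is nearly undecided at the first character, so the mixture leans on the LLM there, while at the second character the small model is confident and dominates. A real system would replace the lookup tables with model calls and would likely tune or learn the weighting rule rather than use this heuristic.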

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis