CEC-Zero: Zero-Supervision Character Error Correction with Self-Generated Rewards

Zhiming Lin; Kai Zhao; Sophie Zhang; Peilai Yu; Canran Xiao

arXiv:2512.23971 · cs.CL · January 1, 2026

TL;DR

CEC-Zero introduces a zero-supervision reinforcement learning framework for Chinese spelling correction, enabling large language models to self-correct errors without annotated data, significantly improving robustness and scalability.

Contribution

CEC-Zero presents a label-free reinforcement learning approach that lets LLMs correct spelling errors without annotated data, outperforming supervised baselines across 9 benchmarks.

Findings

01 Outperforms supervised baselines by 10-13 F1 points

02 Outperforms strong LLM fine-tunes by 5-8 F1 points

03 Provides theoretical guarantees of unbiased rewards and convergence

Abstract

Large-scale Chinese spelling correction (CSC) remains critical for real-world text processing, yet existing LLMs and supervised methods lack robustness to novel errors and rely on costly annotations. We introduce CEC-Zero, a zero-supervision reinforcement learning framework that addresses this by enabling LLMs to correct their own mistakes. CEC-Zero synthesizes errorful inputs from clean text, computes cluster-consensus rewards via semantic similarity and candidate agreement, and optimizes the policy with PPO. It outperforms supervised baselines by 10--13 $F_1$ points and strong LLM fine-tunes by 5--8 points across 9 benchmarks, with theoretical guarantees of unbiased rewards and convergence. CEC-Zero establishes a label-free paradigm for robust, scalable CSC, unlocking LLM potential in noisy text pipelines.
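The first two stages named in the abstract (synthesizing errorful inputs from clean text, then scoring candidates by agreement and similarity) can be illustrated with a toy sketch. This is not the authors' implementation: the confusion set `CONFUSIONS`, the functions `corrupt` and `consensus_rewards`, the mixing weight `alpha`, and the use of a character-overlap ratio in place of a semantic similarity model are all assumptions made for illustration. The PPO update is omitted.

```python
import random
from difflib import SequenceMatcher

# Hypothetical confusion set of easily confused characters (an assumption,
# not taken from the paper).
CONFUSIONS = {"的": ["地", "得"], "在": ["再"], "做": ["作"]}


def corrupt(clean, rate, rng):
    """Replace each confusable character with probability `rate`."""
    out = []
    for ch in clean:
        if ch in CONFUSIONS and rng.random() < rate:
            out.append(rng.choice(CONFUSIONS[ch]))
        else:
            out.append(ch)
    return "".join(out)


def similarity(a, b):
    """Character-overlap ratio, a stand-in for a semantic similarity model."""
    return SequenceMatcher(None, a, b).ratio()


def consensus_rewards(candidates, alpha=0.5):
    """Score each candidate against the other candidates.

    The reward mixes exact agreement (fraction of other candidates that are
    identical) with mean similarity to the other candidates.
    """
    rewards = []
    for i, cand in enumerate(candidates):
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            rewards.append(0.0)
            continue
        agree = sum(c == cand for c in others) / len(others)
        sim = sum(similarity(cand, c) for c in others) / len(others)
        rewards.append(alpha * agree + (1 - alpha) * sim)
    return rewards


rng = random.Random(0)
clean = "我在家做作业"
noisy = corrupt(clean, rate=1.0, rng=rng)  # every confusable character is swapped

# Pretend these are corrections sampled from a policy for the noisy input.
candidates = ["我在家做作业", "我在家做作业", "我再家做作业"]
rewards = consensus_rewards(candidates)
```

In this sketch no gold label enters the reward: the two candidates that agree with each other score higher than the outlier, which is the intuition behind a consensus-style signal. How the paper clusters candidates and weights the two terms is not stated in the abstract.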

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification