CLFEC: A New Task for Unified Linguistic and Factual Error Correction in paragraph-level Chinese Professional Writing

Jian Kai; Zidong Zhang; Jiwen Chen; Zhengxiang Wu; Songtao Sun; Fuyang Li; Yang Cao; Qiang Liu

arXiv:2602.23845 · cs.CL · March 2, 2026

Open Access

TL;DR

This paper introduces CLFEC, a new task for jointly correcting linguistic and factual errors in paragraph-level Chinese professional writing, supported by a new dataset and empirical analysis of correction paradigms.

Contribution

It presents the CLFEC task, constructs a multi-domain dataset, and systematically evaluates LLM-based correction methods for unified linguistic and factual error correction.

Findings

01. Unified correction outperforms decoupled methods.

02. Agentic workflows are effective with suitable models.

03. Challenges include limited generalization and evidence grounding.

Abstract

Chinese text correction has traditionally focused on spelling and grammar, while factual error correction is usually treated separately. However, in paragraph-level Chinese professional writing, linguistic (word/grammar/punctuation) and factual errors frequently co-occur and interact, making unified correction both necessary and challenging. This paper introduces CLFEC (Chinese Linguistic & Factual Error Correction), a new task for joint linguistic and factual correction. We construct a mixed, multi-domain Chinese professional writing dataset spanning current affairs, finance, law, and medicine. We then conduct a systematic study of LLM-based correction paradigms, from prompting to retrieval-augmented generation (RAG) and agentic workflows. The analysis reveals practical challenges, including limited generalization of specialized correction models, the need for evidence grounding for…
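To make the idea of joint correction concrete, here is a minimal sketch of one way a unified correction could be represented: a single list of typed span edits applied to a paragraph in one pass. The `Edit` class, the `kind` labels, and the `apply_edits` helper are assumptions made for illustration, not the paper's data format or method, and the sample sentence is an invented English one chosen for readability rather than an item from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """A hypothetical span edit; 'kind' is 'linguistic' or 'factual'."""
    start: int        # inclusive character offset in the source text
    end: int          # exclusive character offset in the source text
    replacement: str
    kind: str


def apply_edits(text: str, edits: list[Edit]) -> str:
    """Apply non-overlapping edits, whatever their kind, in one pass."""
    ordered = sorted(edits, key=lambda e: e.start)
    # Reject overlapping spans so each edit is unambiguous.
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt.start < prev.end:
            raise ValueError("overlapping edits")
    # Apply right to left so earlier offsets stay valid.
    for e in reversed(ordered):
        text = text[:e.start] + e.replacement + text[e.end:]
    return text


# Invented example: one grammatical error and one factual error co-occur.
source = "The Eiffel Tower are located in Berlin."
are_at = source.index("are")
berlin_at = source.index("Berlin")
edits = [
    Edit(are_at, are_at + len("are"), "is", "linguistic"),
    Edit(berlin_at, berlin_at + len("Berlin"), "Paris", "factual"),
]
corrected = apply_edits(source, edits)
print(corrected)
```

Keeping both error types in one edit list is only one possible design; a decoupled pipeline would instead run a linguistic pass and a factual pass separately and merge their outputs.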

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Academic integrity and plagiarism