C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World Prompts

Chenxi Qing; Junxi Wu; Zheng Liu; Yixiang Qiu; Hongyao Yu; Bin Chen; Hao Wu; Shu-Tao Xia

arXiv:2604.11796 · cs.CL · May 20, 2026

TL;DR

C-ReD is a new Chinese benchmark dataset for AI-generated text detection in real-world scenarios. It targets the limitations of previous datasets in model diversity, domain coverage, and prompt realism.

Contribution

The paper introduces C-ReD, a comprehensive Chinese benchmark for AI-generated text detection derived from real-world prompts, which improves detector generalization and addresses the limitations of prior datasets.

Findings

01. Enables reliable in-domain detection of AI-generated Chinese text.

02. Supports strong generalization to unseen LLMs and external datasets.

03. Addresses critical gaps in existing Chinese detection benchmarks.

Abstract

Large language models (LLMs) can now generate highly fluent text. While they offer significant convenience, they also introduce risks such as phishing and academic dishonesty. Numerous research efforts have been dedicated to developing algorithms for detecting AI-generated text and constructing relevant datasets. However, for Chinese corpora, challenges remain, including limited model diversity and data homogeneity. To address these issues, we propose C-ReD: a comprehensive Chinese Real-prompt AI-generated Detection benchmark. Experiments demonstrate that C-ReD not only enables reliable in-domain detection but also supports strong generalization to unseen LLMs and external Chinese datasets, addressing critical gaps in model diversity, domain coverage, and prompt realism that have limited prior Chinese detection benchmarks. We…

Code & Models

Repositories

HeraldofLight/C-ReD (GitHub)
