System Report for CCL24-Eval Task 7: Multi-Error Modeling and Fluency-Targeted Pre-training for Chinese Essay Evaluation

Jingshen Zhang; Xiangyu Yang; Xinkai Su; Xinglu Chen; Tianyou Huang; Xinying Qiu

arXiv:2407.08206 · cs.CL · July 12, 2024


Open Access

TL;DR

This system report describes approaches to all three tracks of the Chinese Essay Fluency Evaluation (CEFE) task at CCL-2024, combining multi-error modeling with fluency-targeted pre-training, and reports first place in Track 3.

Contribution

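The report introduces multi-error modeling (binary classifiers for hard fine-grained error types and pseudo-data with multiple error types per sentence) and fluency-targeted pre-training on back-translated pseudo-data for Chinese essay evaluation.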

Findings

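01. Achieved first place in Track 3 of the CCL-2024 Chinese Essay Fluency Evaluation (CEFE) task.

02. Optimized predictions for challenging fine-grained error types in Track 1 using binary classification models.

03. Enhanced fluency assessment in Track 3 using back-translated pseudo-data for pre-training and an NSP-based strategy with Symmetric Cross Entropy loss.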

Abstract

This system report presents our approaches and results for the Chinese Essay Fluency Evaluation (CEFE) task at CCL-2024. For Track 1, we optimized predictions for challenging fine-grained error types using binary classification models and trained coarse-grained models on the Chinese Learner 4W corpus. In Track 2, we enhanced performance by constructing a pseudo-dataset with multiple error types per sentence. For Track 3, where we achieved first place, we generated fluency-rated pseudo-data via back-translation for pre-training and used an NSP-based strategy with Symmetric Cross Entropy loss to capture context and mitigate long dependencies. Our methods effectively address key challenges in Chinese Essay Fluency Evaluation.
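
As an illustration of the Symmetric Cross Entropy loss mentioned for Track 3, the following is a minimal sketch of the commonly cited formulation: standard cross entropy plus a reverse cross entropy term in which log 0 is replaced by a negative constant. It is not the authors' implementation. The function names, the weights `alpha` and `beta`, and the constant `log_zero` are assumptions chosen for illustration; the abstract does not state the values used.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def symmetric_cross_entropy(logits, label, alpha=1.0, beta=1.0,
                            log_zero=-4.0, eps=1e-7):
    """Sketch of SCE for one example with a one-hot label.

    alpha, beta, and log_zero are illustrative defaults (assumptions),
    not values reported in the abstract.
    """
    probs = softmax(logits)

    # Standard cross entropy: -log p(label), clamped to avoid log(0).
    ce = -math.log(max(probs[label], eps))

    # Reverse cross entropy: -sum_k p_k * log q_k, where q is the one-hot
    # label. log q_k is 0 for the true class and log_zero for all others.
    rce = -sum(p * log_zero for k, p in enumerate(probs) if k != label)

    return alpha * ce + beta * rce


if __name__ == "__main__":
    # Hypothetical 3-class fluency logits for a single sentence.
    print(symmetric_cross_entropy([2.0, 0.5, -1.0], label=0))
```

The reverse term is bounded for each example, which is the usual reason this loss is chosen when labels may be noisy, as pseudo-labels generated automatically can be.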

Taxonomy

Topics: Topic Modeling