FEANEL: A Benchmark for Fine-Grained Error Analysis in K-12 English Writing

Jingheng Ye; Shen Wang; Jiaqi Chen; Hebin Wang; Deqing Zou; Yanyu Zhu; Jiwei Tang; Hai-Tao Zheng; Ruitong Liu; Haoyang Li; Yanfeng Wang; Qingsong Wen

arXiv:2511.22883 · cs.CL · December 1, 2025

TL;DR

This paper introduces FEANEL, a benchmark of expert-annotated student essays paired with an error taxonomy, to evaluate and improve LLMs' ability to provide detailed feedback on K-12 English writing.

Contribution

It presents the FEANEL benchmark, including a large annotated dataset and taxonomy, to assess and enhance LLMs' error analysis and pedagogical skills in education.

Findings

01. Current LLMs show significant gaps in fine-grained error analysis.

02. The benchmark reveals specific areas where LLMs need improvement.

03. Experimental results highlight the need for advanced methods for educational feedback.

Abstract

Large Language Models (LLMs) have transformed artificial intelligence, offering profound opportunities for educational applications. However, their ability to provide fine-grained educational feedback for K-12 English writing remains underexplored. In this paper, we challenge the error analysis and pedagogical skills of LLMs by introducing the problem of Fine-grained Error Analysis for English Learners and present the Fine-grained Error ANalysis for English Learners (FEANEL) Benchmark. The benchmark comprises 1,000 essays written by elementary and secondary school students, and a well-developed English writing error taxonomy. Each error is annotated by language education experts and categorized by type, severity, and explanatory feedback, using a part-of-speech-based taxonomy they co-developed. We evaluate state-of-the-art LLMs on the FEANEL Benchmark to explore their error analysis and…
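The annotation scheme described in the abstract (each error labeled with a type, a severity, and explanatory feedback) can be pictured as a simple record. The following is a minimal sketch only: the class names, field names, and the example labels `"verb"` and `"major"` are hypothetical and are not taken from the FEANEL release, whose actual file format is not described here.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ErrorAnnotation:
    # Hypothetical field names: the abstract only states that each error
    # carries a type, a severity, and explanatory feedback.
    error_type: str   # a category from the part-of-speech-based taxonomy
    severity: str     # severity label; the label set is an assumption
    explanation: str  # explanatory feedback written for the learner


@dataclass
class AnnotatedEssay:
    # Hypothetical container for one student essay and its annotations.
    essay_id: str
    text: str
    errors: list[ErrorAnnotation] = field(default_factory=list)


def count_by_type(essays: list[AnnotatedEssay]) -> Counter:
    """Tally annotated errors per error type across a list of essays."""
    return Counter(err.error_type for essay in essays for err in essay.errors)


# Made-up example data, for illustration only.
example = AnnotatedEssay(
    essay_id="demo-1",
    text="She go to school yesterday.",
    errors=[
        ErrorAnnotation(
            error_type="verb",
            severity="major",
            explanation="'yesterday' signals past time, so use 'went'.",
        )
    ],
)

type_counts = count_by_type([example])
```

A per-type tally like `type_counts` is one plausible starting point for comparing a model's predicted error types against the expert labels, though the paper's own evaluation protocol may differ.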

Taxonomy

Topics: Text Readability and Simplification · Topic Modeling · Natural Language Processing Techniques