Automated Essay Scoring Using Grammatical Variety and Errors with Multi-Task Learning and Item Response Theory
Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura

TL;DR
This paper examines how grammatical features, multi-task learning, and Item Response Theory (IRT) improve automatic essay scoring (AES). The features cover the grammatical items a writer used correctly and the number of grammatical errors, and IRT is used to weight those features by item difficulty.
Contribution
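It introduces an AES approach that combines grammatical features, multi-task learning over holistic and grammar scores, and IRT-based feature weighting. It also shows that grammar abilities estimated with IRT can serve as auxiliary-task labels with performance comparable to grammar scores assigned by human raters, which reduces reliance on human annotation.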
Findings
Grammatical features improve AES model performance.
Multi-task learning with grammar scores enhances scoring accuracy.
IRT-based weighting of grammatical features further boosts performance.
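The multi-task finding above can be pictured as one model trained on two targets at once, with the grammar score acting as an auxiliary signal. The following is a minimal sketch, not the authors' implementation: it assumes mean squared error for both tasks and a hypothetical mixing weight `aux_weight`, neither of which is stated in this summary.

```python
from typing import Sequence


def mse(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Mean squared error between predictions and gold labels."""
    if len(pred) != len(gold) or len(pred) == 0:
        raise ValueError("pred and gold must be non-empty and equal in length")
    return sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(pred)


def multitask_loss(
    holistic_pred: Sequence[float],
    holistic_gold: Sequence[float],
    grammar_pred: Sequence[float],
    grammar_gold: Sequence[float],
    aux_weight: float = 0.5,  # hypothetical value, chosen only for illustration
) -> float:
    """Main-task loss plus a down-weighted auxiliary (grammar) loss."""
    main = mse(holistic_pred, holistic_gold)
    aux = mse(grammar_pred, grammar_gold)
    return main + aux_weight * aux
```

In this sketch, `grammar_gold` could hold either human-assigned grammar scores or IRT-estimated grammar abilities; the loss itself does not change.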
Abstract
This study examines the effect of grammatical features in automatic essay scoring (AES). We use two kinds of grammatical features as input to an AES model: (1) grammatical items that writers used correctly in essays, and (2) the number of grammatical errors. Experimental results show that grammatical features improve the performance of AES models that predict the holistic scores of essays. Multi-task learning with the holistic and grammar scores, alongside using grammatical features, resulted in a larger improvement in model performance. We also show that a model using grammar abilities estimated using Item Response Theory (IRT) as the labels for the auxiliary task achieved comparable performance to when we used grammar scores assigned by human raters. In addition, we weight the grammatical features using IRT to consider the difficulty of grammatical items and writers' grammar…
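For readers unfamiliar with IRT, the sketch below shows how an item response function relates a writer's ability to an item's difficulty. It assumes the common two-parameter logistic (2PL) form; the abstract does not say which IRT model the authors use, and the names `theta`, `difficulty`, and `discrimination` are illustrative.

```python
import math


def p_correct(theta: float, difficulty: float, discrimination: float = 1.0) -> float:
    """2PL item response function (assumed form, not confirmed by the source).

    Returns the probability that a writer with ability `theta` correctly
    uses a grammatical item with the given difficulty.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# For a fixed ability, a harder item gets a lower probability of correct use.
easy_item = p_correct(theta=1.0, difficulty=0.0)
hard_item = p_correct(theta=1.0, difficulty=2.0)
```

Under this form, a writer whose ability equals the item's difficulty has a probability of exactly 0.5, and correctly using an item well above one's estimated ability is an unlikely, and therefore informative, event.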
Taxonomy
Topics: Innovative Teaching and Learning Methods
