Automated Essay Scoring Using Grammatical Variety and Errors with Multi-Task Learning and Item Response Theory

Kosuke Doi; Katsuhito Sudoh; Satoshi Nakamura

arXiv:2406.08817 · cs.CL · June 14, 2024 · 2 cites

TL;DR

This paper examines how grammatical features, multi-task learning, and Item Response Theory (IRT) improve automated essay scoring (AES). The model takes correctly used grammatical items and grammatical error counts as input, and IRT is used to weight those features by item difficulty.

Contribution

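The paper introduces an AES approach that combines grammatical features, multi-task learning, and IRT-based weighting to improve scoring accuracy. It also reduces reliance on human-annotated grammar scores: grammar abilities estimated with IRT serve as auxiliary-task labels with comparable performance.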

Findings

01 Grammatical features improve AES model performance.

02 Multi-task learning with grammar scores enhances scoring accuracy.

03 IRT-based weighting of grammatical features further boosts performance.
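
To make the idea of IRT-based weighting concrete, here is a minimal sketch in Python. It assumes a standard two-parameter logistic (2PL) IRT model and a hypothetical weighting rule in which a correctly used grammatical item counts for more when the model considers it harder for that writer. The item names, parameter values, and the function `weight_features` are illustrative assumptions; the page does not specify the paper's actual weighting scheme.

```python
import math


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL IRT: probability that a writer with ability `theta` uses an item
    with discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def weight_features(used_items, item_params, theta):
    """Hypothetical weighting (an assumption, not the paper's formula):
    a correctly used item gets weight 1 - P(correct), so items that are
    harder for this writer contribute more."""
    weights = {}
    for item in used_items:
        a, b = item_params[item]
        weights[item] = 1.0 - p_correct(theta, a, b)
    return weights


# Illustrative item parameters (discrimination, difficulty); values are made up.
item_params = {"past_simple": (1.0, -1.0), "subjunctive": (1.2, 1.5)}
weights = weight_features(["past_simple", "subjunctive"], item_params, theta=0.0)
```

Under these made-up parameters the harder item ("subjunctive") receives the larger weight, which is the qualitative behavior the finding describes.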

Abstract

This study examines the effect of grammatical features in automatic essay scoring (AES). We use two kinds of grammatical features as input to an AES model: (1) grammatical items that writers used correctly in essays, and (2) the number of grammatical errors. Experimental results show that grammatical features improve the performance of AES models that predict the holistic scores of essays. Multi-task learning with the holistic and grammar scores, alongside using grammatical features, resulted in a larger improvement in model performance. We also show that a model using grammar abilities estimated using Item Response Theory (IRT) as the labels for the auxiliary task achieved comparable performance to when we used grammar scores assigned by human raters. In addition, we weight the grammatical features using IRT to consider the difficulty of grammatical items and writers' grammar…
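
The multi-task setup described in the abstract can be sketched as a joint loss over the holistic score and the grammar score. The version below is a minimal illustration in plain Python, assuming mean squared error for both tasks and a hypothetical mixing weight `aux_weight`; the abstract does not state the loss function or the weighting actually used.

```python
def mse(preds, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def multitask_loss(holistic_pred, holistic_gold,
                   grammar_pred, grammar_gold, aux_weight=0.5):
    """Main-task loss (holistic score) plus a weighted auxiliary loss
    (grammar score). `aux_weight` is a hypothetical hyperparameter."""
    main = mse(holistic_pred, holistic_gold)
    aux = mse(grammar_pred, grammar_gold)
    return main + aux_weight * aux


# Toy batch of two essays; all scores are invented for illustration.
loss = multitask_loss(
    holistic_pred=[3.0, 4.0], holistic_gold=[3.0, 5.0],
    grammar_pred=[2.0, 4.0], grammar_gold=[3.0, 4.0],
    aux_weight=0.5,
)
```

In this sketch the grammar labels could come either from human raters or from IRT-estimated abilities; the loss itself does not change, only the source of `grammar_gold`.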

Taxonomy

Topics: Innovative Teaching and Learning Methods