Can GPT-4 do L2 analytic assessment?
Stefano Bannò, Hari Krishna Vydana, Kate M. Knill, Mark J. F. Gales

TL;DR
This paper investigates GPT-4's ability to perform detailed second language writing assessments by correlating its zero-shot analytic scoring with established proficiency components, offering insights into automated L2 evaluation.
Contribution
The paper demonstrates GPT-4's potential to predict specific L2 writing proficiency components in a zero-shot setting, advancing automated analytic assessment methods.
Findings
GPT-4's predicted analytic scores correlate significantly with proficiency features.
GPT-4 can extract detailed analytic components from L2 writing.
Automated scoring aligns with human-annotated proficiency levels.
Abstract
Automated essay scoring (AES) to evaluate second language (L2) proficiency has been a firmly established technology used in educational contexts for decades. Although holistic scoring has seen advancements in AES that match or even exceed human performance, analytic scoring still encounters issues as it inherits flaws and shortcomings from the human scoring process. The recent introduction of large language models presents new opportunities for automating the evaluation of specific aspects of L2 writing proficiency. In this paper, we perform a series of experiments using GPT-4 in a zero-shot fashion on a publicly available dataset annotated with holistic scores based on the Common European Framework of Reference and aim to extract detailed information about their underlying analytic components. We observe significant correlations between the automatically predicted analytic scores and…
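The correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Pearson coefficient, which the visible text does not confirm, and it uses hypothetical toy values rather than data from the paper. The names `pearson`, `predicted_scores`, and `reference_feature` are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, NOT taken from the paper:
# one model-predicted analytic score and one reference feature per essay.
predicted_scores = [2.0, 3.5, 4.0, 5.0, 3.0]
reference_feature = [1.0, 2.5, 3.0, 4.5, 2.0]

r = pearson(predicted_scores, reference_feature)
print(f"Pearson r = {r:.3f}")
```

A rank-based coefficient such as Spearman's may be a better fit when the scores are ordinal proficiency levels; the same structure applies after replacing the values with their ranks.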
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Assessment and Pedagogy · Educational Technology and Assessment
Methods: Attention Is All You Need · Dropout · Softmax · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Linear Layer · Dense Connections · Label Smoothing · Residual Connection
