DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing
Haneul Yoo, Jieun Han, So-Yeon Ahn, Alice Oh

TL;DR
This paper introduces DREsS, a large-scale, rubric-based dataset for automated essay scoring (AES) in EFL education. It combines real and synthetic data to make AES systems more practical.
Contribution
It provides the first large-scale, rubric-based EFL essay dataset with real and synthetic samples, and proposes a corruption-based augmentation strategy to enhance AES performance.
Findings
Synthetic data generated with the CASE augmentation strategy improves baseline results by 45.44%
DREsS includes 48.9K samples with real and augmented essays
Enables more accurate and practical AES for EFL writing
Abstract
Automated essay scoring (AES) is a useful tool in English as a Foreign Language (EFL) writing education, offering real-time essay scores for students and instructors. However, previous AES models were trained on essays and scores irrelevant to the practical scenarios of EFL writing education and usually provided a single holistic score due to the lack of appropriate datasets. In this paper, we release DREsS, a large-scale, standard dataset for rubric-based automated essay scoring with 48.9K samples in total. DREsS comprises three sub-datasets: DREsS_New, DREsS_Std., and DREsS_CASE. We collect DREsS_New, a real-classroom dataset with 2.3K essays authored by EFL undergraduate students and scored by English education experts. We also standardize existing rubric-based essay scoring datasets as DREsS_Std. We suggest CASE, a corruption-based augmentation strategy for essays, which generates…
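The abstract above is cut off before it describes how CASE works, so the following is only a minimal sketch of what a corruption-based augmentation could look like in general, not the authors' procedure. Every name here (`corrupt_essay`, `distractors`, `max_score`) is hypothetical, and both the sentence-replacement corruption and the linear score reduction are assumptions made for illustration.

```python
import random


def corrupt_essay(sentences, distractors, n_corrupt, max_score=5.0, seed=0):
    """Illustrative only: replace n_corrupt sentences with unrelated ones.

    Assumptions (not taken from the paper):
    - the input essay is well written and deserves max_score;
    - the synthetic score falls linearly with the share of corrupted sentences.
    """
    if not sentences:
        raise ValueError("essay must contain at least one sentence")
    if not 0 <= n_corrupt <= len(sentences):
        raise ValueError("n_corrupt must be between 0 and len(sentences)")

    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    corrupted = list(sentences)

    # Pick distinct positions and overwrite them with off-topic sentences.
    for pos in rng.sample(range(len(sentences)), n_corrupt):
        corrupted[pos] = rng.choice(distractors)

    kept_ratio = 1 - n_corrupt / len(sentences)
    return corrupted, round(max_score * kept_ratio, 2)


essay = ["Cities need parks.", "Parks reduce stress.", "They cool the air.", "So fund them."]
synthetic_essay, synthetic_score = corrupt_essay(essay, ["My cat likes tuna."], n_corrupt=2)
```

Generating several corrupted variants per essay with different `n_corrupt` values would yield synthetic samples spread across the score range, which is the general idea behind augmenting a small scored corpus.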
Taxonomy
Topics: Student Assessment and Feedback · Educational Technology and Assessment · Educational Assessment and Pedagogy
