Effective sampling for large-scale automated writing evaluation systems
Nicholas Dronen, Peter W. Foltz, Kyle Habermehl

TL;DR
This paper investigates efficient sampling algorithms to train automated writing evaluation systems with fewer essays, maintaining high accuracy while reducing costs in large-scale educational settings.
Contribution
It evaluates sampling algorithms that select the most informative essays for training, so that models reach strong predictive performance with smaller human-scored training sets.
Findings
Minimized training set sizes while maintaining accuracy
Reduced costs of human scoring in large-scale AWE
Enhanced efficiency of AWE system training processes
Abstract
Automated writing evaluation (AWE) has been shown to be an effective mechanism for quickly providing feedback to students. It has already seen wide adoption in enterprise-scale applications and is starting to be adopted in large-scale contexts. Training an AWE model has historically required a single batch of several hundred writing examples and human scores for each of them. This requirement limits large-scale adoption of AWE since human-scoring essays is costly. Here we evaluate algorithms for ensuring that AWE models are consistently trained using the most informative essays. Our results show how to minimize training set sizes while maximizing predictive performance, thereby reducing cost without unduly sacrificing accuracy. We conclude with a discussion of how to integrate this approach into large-scale AWE systems.
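The abstract does not say which selection algorithms are evaluated, so the following is only an illustrative sketch of the general idea of choosing a small, informative subset of essays to send for human scoring. It assumes each essay has already been reduced to a numeric feature vector and uses Kennard-Stone selection, a generic space-filling design technique swapped in here for illustration, not necessarily the authors' method. The names `kennard_stone` and `features` are hypothetical.

```python
import math


def kennard_stone(features, k):
    """Return indices of k rows of `features` picked by Kennard-Stone selection.

    `features` is a list of equal-length numeric vectors, one per essay
    (an assumption of this sketch; the source does not define the features).
    """
    n = len(features)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of essays")

    # Pairwise Euclidean distances between essays in feature space.
    dist = [[math.dist(a, b) for b in features] for a in features]

    # Seed the selection with the two essays that are farthest apart.
    first, second = max(
        ((a, b) for a in range(n) for b in range(a + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}

    # Repeatedly add the essay whose nearest already-selected essay is
    # farthest away, so the chosen set spreads over the feature space.
    while len(selected) < k:
        nxt = max(
            sorted(remaining),
            key=lambda c: min(dist[c][s] for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


if __name__ == "__main__":
    # Toy one-dimensional "essay features", padded to two dimensions.
    essays = [[0, 0], [1, 0], [10, 0], [5, 0], [0.5, 0]]
    print(kennard_stone(essays, 3))
```

In a workflow like the one the abstract describes, only the essays at the returned indices would be sent to human scorers, and the model would be trained on that subset. Whether such a subset preserves predictive accuracy is exactly the empirical question the paper addresses.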
Taxonomy
Topics: Software Engineering Research · Machine Learning and Algorithms · Topic Modeling
