Automated Text Scoring in the Age of Generative AI for the GPU-poor
Christopher Michael Ormerod, Alexander Kwako

TL;DR
This paper evaluates the potential of small, open-source generative language models for automated text scoring on modest hardware, highlighting their performance, efficiency, and capacity to generate feedback, as alternatives to proprietary models.
Contribution
It explores the use of open-source, small-scale generative language models (GLMs) for automated text scoring (ATS), showing that they can be fine-tuned for the task and taking initial steps towards automated feedback generation.
Findings
GLMs can be fine-tuned for adequate, though not state-of-the-art, ATS performance
Small, open-source models run on modest, consumer-grade hardware
Model-generated feedback shows potential but needs further validation
Abstract
Current research on generative language models (GLMs) for automated text scoring (ATS) has focused almost exclusively on querying proprietary models via Application Programming Interfaces (APIs). Yet such practices raise issues around transparency and security, and these methods offer little in the way of efficiency or customizability. With the recent proliferation of smaller, open-source models, there is the option to explore GLMs with computers equipped with modest, consumer-grade hardware, that is, for the "GPU poor." In this study, we analyze the performance and efficiency of open-source, small-scale GLMs for ATS. Results show that GLMs can be fine-tuned to achieve adequate, though not state-of-the-art, performance. In addition to ATS, we take small steps towards analyzing models' capacity for generating feedback by prompting GLMs to explain their scores. Model-generated feedback…
Taxonomy
Topics: Robotics and Automated Systems · Topic Modeling · Image Processing and 3D Reconstruction
