Short Answer Grading Using One-shot Prompting and Text Similarity Scoring Model
Su-Youn Yoon

TL;DR
This paper presents an automated short answer grading system that combines one-shot prompting with text similarity scoring to provide detailed analytic scores and holistic assessments, improving interpretability and feedback quality.
Contribution
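The study introduces an approach that combines LLM-based one-shot prompting with a text similarity scoring model for short answer grading. Because the similarity model is domain-adapted with only a small manually annotated dataset, the approach reduces the need for large-scale manual annotation.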
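To make the similarity-scoring idea concrete, here is a minimal sketch, not the paper's implementation. It assumes the LLM prompting step has already extracted one text span per sub-question, and it swaps in a plain bag-of-words cosine similarity for the paper's (unspecified here) domain-adapted similarity model. The names `tokenize`, `cosine_similarity`, `grade`, the 0/1 analytic scale, the `threshold` value, and the rule that the holistic score is the sum of analytic scores are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity between two strings (0.0 if either is empty)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def grade(spans, references, threshold=0.5):
    """Score each sub-question span against its reference answer.

    spans:      sub-question id -> span extracted from the student response
                (assumed to come from the LLM prompting step)
    references: sub-question id -> reference answer text
    threshold:  hypothetical cut-off for awarding the sub-question point

    Returns (analytic_scores, holistic_score). Summing analytic scores into
    the holistic score is an assumption of this sketch.
    """
    analytic = {}
    for sub_q, reference in references.items():
        span = spans.get(sub_q, "")
        similarity = cosine_similarity(span, reference)
        analytic[sub_q] = 1 if similarity >= threshold else 0
    return analytic, sum(analytic.values())


references = {"a": "plants absorb sunlight", "b": "water evaporates"}
spans = {"a": "plants absorb sunlight", "b": ""}
analytic, holistic = grade(spans, references)
```

In this toy run, sub-question `a` matches its reference and `b` has no extracted span, so only `a` earns its point. Keeping the per-sub-question scores alongside the spans is what would let a system of this kind point students to the part of the answer that was missing.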
Findings
Achieved 0.67 accuracy and 0.71 quadratic weighted kappa on a subset of a publicly available ASAG dataset.
Significant improvement over majority baseline.
Enhanced interpretability through analytic scores and relevant text spans for each sub-question.
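For readers unfamiliar with the two reported metrics, the following is a small self-contained sketch of how accuracy and quadratic weighted kappa are commonly computed for integer scores. It follows the standard definition of Cohen's kappa with quadratic weights; the function names and the toy score lists are illustrative and are not taken from the paper.

```python
from collections import Counter


def accuracy(human, machine):
    """Fraction of responses where the two scores match exactly."""
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Cohen's kappa with quadratic disagreement weights for integer scores."""
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    if max_score <= min_score:
        raise ValueError("need at least two score categories")
    n = len(human)
    scale = (max_score - min_score) ** 2
    observed = Counter(zip(human, machine))  # joint counts
    hist_h = Counter(human)                  # marginal counts, human rater
    hist_m = Counter(machine)                # marginal counts, machine rater
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / scale
            weighted_observed += weight * observed[(i, j)]
            # Expected count under independence of the two raters.
            weighted_expected += weight * hist_h[i] * hist_m[j] / n
    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when both raters use one identical score")
    return 1.0 - weighted_observed / weighted_expected


human_scores = [0, 1, 2, 3]
machine_scores = [0, 1, 2, 2]
acc = accuracy(human_scores, machine_scores)
qwk = quadratic_weighted_kappa(human_scores, machine_scores, 0, 3)
```

Unlike accuracy, the quadratic weighting penalizes a disagreement by the squared distance between the two scores, so being off by one point costs far less than being off by three. That is why the two numbers can differ noticeably on the same predictions.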
Abstract
In this study, we developed an automated short answer grading (ASAG) model that provided both analytic scores and final holistic scores. Short answer items typically consist of multiple sub-questions, and providing an analytic score and the text span relevant to each sub-question can increase the interpretability of the automated scores. Furthermore, they can be used to generate actionable feedback for students. Despite these advantages, most studies have focused on predicting only holistic scores due to the difficulty of constructing datasets with manual annotations. To address this difficulty, we used large language model (LLM)-based one-shot prompting and a text similarity scoring model with domain adaptation using a small manually annotated dataset. The accuracy and quadratic weighted kappa of our model were 0.67 and 0.71 on a subset of the publicly available ASAG dataset. The model…
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Online Learning and Analytics
