TL;DR
This paper proposes a cross-prompt pre-finetuning method for automated short answer scoring that leverages key phrases and cross-prompt data to reduce annotation costs and improve accuracy in resource-limited settings.
Contribution
The paper introduces a two-phase approach: a scoring model is first trained on existing cross-prompt data together with key phrases, then finetuned on a small amount of data for the new prompt.
Findings
Pre-finetuning on cross-prompt data improves scoring accuracy on a new prompt.
Using key phrases significantly boosts scoring performance.
The method is effective when training data for the new prompt is limited.
Abstract
Automated Short Answer Scoring (SAS) is the task of automatically scoring a given input to a prompt based on rubrics and reference answers. Although SAS is useful in real-world applications, both rubrics and reference answers differ between prompts, so new data must be acquired and a model trained for each new prompt. Such requirements are costly, especially for schools and online courses where resources are limited and only a few prompts are used. In this work, we attempt to reduce this cost through a two-phase approach: train a model on existing rubrics and answers with gold score signals, then finetune it on a new prompt. Specifically, given that scoring rubrics and reference answers differ for each prompt, we utilize key phrases, or representative expressions that the answer should contain to increase scores, and train a SAS model to learn the relationship between key…
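The two-phase idea described in the abstract can be sketched in a few lines. This is only an illustration under strong assumptions, not the paper's model: the scorer here is a one-feature linear regressor over key-phrase token overlap, scores are normalized to [0, 1], and every name (`overlap_feature`, `train`, `predict`) and every example answer is hypothetical and invented for the sketch.

```python
import re

def overlap_feature(key_phrases, answer):
    """Fraction of key-phrase tokens that also appear in the answer."""
    tokenize = lambda s: set(re.findall(r"\w+", s.lower()))
    key_tokens = set().union(*(tokenize(k) for k in key_phrases))
    if not key_tokens:
        return 0.0
    return len(key_tokens & tokenize(answer)) / len(key_tokens)

def train(weights, examples, epochs=200, lr=0.1):
    """Plain SGD on squared error for score = w * overlap + b."""
    w, b = weights
    for _ in range(epochs):
        for key_phrases, answer, score in examples:
            x = overlap_feature(key_phrases, answer)
            err = (w * x + b) - score
            w -= lr * err * x
            b -= lr * err
    return w, b

def predict(weights, key_phrases, answer):
    w, b = weights
    return w * overlap_feature(key_phrases, answer) + b

# Invented toy data: (key phrases, answer, normalized gold score).
cross_prompt = [
    (["photosynthesis", "sunlight"], "Plants use sunlight for photosynthesis.", 1.0),
    (["photosynthesis", "sunlight"], "Plants grow in soil.", 0.0),
    (["supply", "demand"], "Prices rise when demand exceeds supply.", 1.0),
    (["supply", "demand"], "Prices are set by the government.", 0.0),
]
new_prompt = [
    (["evaporation", "condensation"],
     "Water rises by evaporation and returns after condensation.", 1.0),
    (["evaporation", "condensation"], "Water is wet.", 0.0),
]

# Phase 1: train on existing prompts (cross-prompt data with key phrases).
weights = train((0.0, 0.0), cross_prompt)
# Phase 2: finetune briefly on the few examples available for the new prompt.
weights = train(weights, new_prompt, epochs=20)

high = predict(weights, ["evaporation", "condensation"],
               "Evaporation is followed by condensation.")
low = predict(weights, ["evaporation", "condensation"], "The sky is blue.")
```

Because the feature depends only on how an answer relates to its key phrases, not on the prompt's topic, what is learned in phase 1 carries over to the unseen prompt, which is the intuition behind reusing cross-prompt data.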
