Using language models in the implicit automated assessment of mathematical short answer items
Christopher Ormerod

TL;DR
This paper introduces a novel language model pipeline for assessing mathematical short answers by identifying key values, improving accuracy over traditional scoring, and enabling targeted feedback for students and teachers.
Contribution
The paper presents a new value identification pipeline using fine-tuned language models that enhances assessment accuracy and informativeness for mathematical responses.
Findings
The pipeline accurately identifies implicit and explicit key values in responses.
It outperforms traditional rubric-based scoring methods.
Its output can be used to give targeted feedback to both teachers and students.
Abstract
We propose a new way to assess certain short constructed responses to mathematics items. Our approach uses a pipeline that identifies the key values specified by the student in their response. This allows us to determine the correctness of the response, as well as identify any misconceptions. The information from the value identification pipeline can then be used to provide feedback to the teacher and student. The value identification pipeline consists of two fine-tuned language models. The first model determines if a value is implicit in the student response. The second model identifies where in the response the key value is specified. We consider both a generic model that can be used for any prompt and value, as well as models that are specific to each prompt and value. The value identification pipeline is a more accurate and informative way to assess short constructed responses than…
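The two-stage structure described in the abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the names `identify_value`, `ValueResult`, `toy_classifier`, and `toy_locator` are hypothetical, and the two stand-in functions use plain string matching where the paper uses fine-tuned language models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets into the response


@dataclass
class ValueResult:
    present: bool          # stage 1: is the key value expressed in the response?
    span: Optional[Span]   # stage 2: where in the response it is specified


def identify_value(
    response: str,
    key_value: str,
    classifier: Callable[[str, str], bool],
    locator: Callable[[str, str], Optional[Span]],
) -> ValueResult:
    """Run the two stages in order; skip localisation if stage 1 says no."""
    if not classifier(response, key_value):
        return ValueResult(present=False, span=None)
    return ValueResult(present=True, span=locator(response, key_value))


# Hypothetical stand-ins for the two fine-tuned models (assumption:
# simple substring matching, used only to make the sketch runnable).
def toy_classifier(response: str, key_value: str) -> bool:
    return key_value in response


def toy_locator(response: str, key_value: str) -> Optional[Span]:
    start = response.find(key_value)
    return None if start < 0 else (start, start + len(key_value))


response = "The area is 12 square units"
result = identify_value(response, "12", toy_classifier, toy_locator)
missing = identify_value(response, "15", toy_classifier, toy_locator)
```

Passing the two stages in as callables mirrors the abstract's distinction between a generic model and prompt-specific models: either kind could be substituted without changing the surrounding pipeline code.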
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Assessment and Pedagogy · Educational Technology and Assessment
