Characteristics of hand and machine-assigned scores to college students' answers to open-ended tasks

Stephen P. Klein

arXiv:0805.2829·stat.AP·December 18, 2008

TL;DR

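This study finds that machine-assigned scores on college students' answers to open-ended tasks agree highly with scores from human readers, correlate with SAT scores and college grades about as well as hand-assigned scores do, and do not increase score disparities across groups. These results suggest machine scoring could address the cost and turnaround concerns of large-scale assessment.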

Contribution

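The paper provides evidence, from 1359 students at 14 colleges, that machine scoring of open-ended responses agrees highly with hand scoring, relates to other academic measures in a comparable way, and does not increase score disparities across groups.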

Findings

01 High inter-reader agreement in human scoring

02 Machine scores correlate strongly with human scores

03 Machine scoring does not increase score disparities across groups

Abstract

Assessment of learning in higher education is a critical concern for policy makers, educators, parents, and students. Doing so appropriately is likely to require including constructed-response tests in the assessment system. We examined whether scoring costs and other concerns with using open-ended measures on a large scale (e.g., turnaround time and inter-reader consistency) could be addressed by machine grading the answers. Analyses with 1359 students from 14 colleges found that two human readers agreed highly with each other in the scores they assigned to the answers to three types of open-ended questions. These reader-assigned scores also agreed highly with those assigned by a computer. The correlations of the machine-assigned scores with SAT scores, college grades, and other measures were comparable to the correlations of these variables with the hand-assigned scores. Machine…
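The agreement comparisons described in the abstract can be illustrated with a small sketch. It assumes agreement is summarized with a Pearson correlation, which the abstract does not confirm. The score lists, the `pearson` helper, and the choice to average the two readers into a single hand score are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six answers (illustrative only, not study data).
reader_1 = [3, 4, 2, 5, 4, 1]
reader_2 = [3, 5, 2, 4, 4, 2]
machine = [3.2, 4.4, 2.1, 4.6, 3.9, 1.5]

# Assumption: the hand score for an answer is the mean of the two readers.
hand_mean = [(a + b) / 2 for a, b in zip(reader_1, reader_2)]

inter_reader_r = pearson(reader_1, reader_2)   # reader vs. reader
hand_machine_r = pearson(hand_mean, machine)   # hand vs. machine

print(f"inter-reader r = {inter_reader_r:.2f}")
print(f"hand-machine r = {hand_machine_r:.2f}")
```

The same helper could be applied to an external measure, for example correlating a hypothetical list of SAT scores first with `hand_mean` and then with `machine`, to check whether the two correlations are comparable.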
