A baseline for multiple-choice testing in the university classroom
Aaron D. Slepkov, Melissa L. Van Bussel, Kara M. Fitze, and Wesley S. Burr

TL;DR
This paper provides a comprehensive analysis of 182 classroom multiple-choice tests at a Canadian university, establishing baseline psychometric metrics and updating item quality guidelines for non-expert instructor-created assessments.
Contribution
It offers the first large-scale empirical baseline for classroom MCQ tests, introduces modified statistical measures, and updates item quality guidelines for non-expert test creators.
Findings
Established baseline psychometric metrics for classroom tests
Introduced modified statistical measures for test analysis
Updated item quality guidelines for instructors
Abstract
There is a broad literature in multiple-choice test development, both in terms of item-writing guidelines and psychometric functionality as a measurement tool. However, most of the published literature concerns multiple-choice testing in the context of expert-designed high-stakes standardized assessments, with little attention being paid to the use of the technique within non-expert instructor-created classroom examinations. In this work we present a quantitative analysis of a large corpus of multiple-choice tests deployed in the classrooms of a primarily undergraduate university in Canada. Our report aims to establish three related things: First, reporting on the functional and psychometric operation of 182 multiple-choice tests deployed in a variety of courses at all undergraduate levels of education establishes a much-needed baseline for actual as-deployed classroom tests. Second, we…
