Peer Grading in a Course on Algorithms and Data Structures: Machine Learning Algorithms do not Improve over Simple Baselines
Mehdi S. M. Sajjadi, Morteza Alamgir, Ulrike von Luxburg

TL;DR
This study investigates peer grading in a university course on algorithms, finding that machine learning algorithms do not outperform simple averaging of peer grades in accuracy.
Contribution
The paper provides an empirical analysis of peer grading data and evaluates various statistical and machine learning methods for grade aggregation, showing that none of them improves over the mean peer grade.
Findings
Machine learning algorithms do not outperform the simple mean of the peer grades.
Thorough dataset analysis highlights limitations of current aggregation methods.
Averaging peer grades is as accurate as the more complex aggregation models in this context.
Abstract
Peer grading is the process of students reviewing each other's work, such as homework submissions, and has lately become a popular mechanism used in massive open online courses (MOOCs). Intrigued by this idea, we used it in a course on algorithms and data structures at the University of Hamburg. Throughout the whole semester, students repeatedly handed in submissions to exercises, which were then evaluated both by teaching assistants and by a peer grading mechanism, yielding a large dataset of teacher and peer grades. We applied different statistical and machine learning methods to aggregate the peer grades in order to come up with accurate final grades for the submissions (supervised and unsupervised, methods based on numeric scores and ordinal rankings). Surprisingly, none of them improves over the baseline of using the mean peer grade as the final grade. We discuss a number of…
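The baseline named in the abstract, using the mean peer grade as the final grade, can be illustrated with a minimal Python sketch. The function names, the grade values, and the choice of root-mean-square error against the teacher grade are assumptions made for illustration only; they are not taken from the paper, whose data and evaluation measure are not given in this summary.

```python
from math import sqrt
from statistics import mean


def aggregate_mean(peer_grades):
    """Map each submission id to the mean of its peer grades (the baseline)."""
    return {sub: mean(grades) for sub, grades in peer_grades.items()}


def rmse(predicted, reference):
    """Root-mean-square error over submissions present in both mappings.

    RMSE is an assumed error measure, chosen here only for illustration.
    """
    shared = predicted.keys() & reference.keys()
    if not shared:
        raise ValueError("no shared submissions to compare")
    squared = sum((predicted[s] - reference[s]) ** 2 for s in shared)
    return sqrt(squared / len(shared))


# Made-up illustrative data: submission id -> grades given by peers.
peer_grades = {"s1": [7, 8, 9], "s2": [4, 6], "s3": [10, 9, 8, 9]}
# Made-up teacher grades used as the reference.
teacher_grades = {"s1": 8, "s2": 6, "s3": 9}

final_grades = aggregate_mean(peer_grades)
error = rmse(final_grades, teacher_grades)
```

Any other aggregation method can be compared with this baseline by swapping it in for `aggregate_mean` and computing the same error against the teacher grades.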
Taxonomy
Topics: Online Learning and Analytics · Student Assessment and Feedback · Educational Technology and Assessment
