Parameter estimation in Comparative Judgement
Ian Hamilton, Nick Tawn

TL;DR
This paper examines how the choice of likelihood penalty affects parameter estimation in Comparative Judgement. It shows that the most commonly used penalty can lead to substantial bias under adaptive scheduling, and it proposes a bootstrap-based method that yields better, more robust estimates.
Contribution
It identifies limitations of standard penalties in adaptive Comparative Judgement and introduces a bootstrap-based approach that enhances estimation accuracy and robustness.
Findings
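The most commonly used penalty is not the best-performing one under adaptive scheduling and can substantially bias parameter estimates.
A bootstrap-based approach produces better parameter estimates for adaptive schedules.
The bootstrap-based approach is robust to variations in the underlying strength distribution.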
Abstract
Comparative Judgement is an assessment method where item ratings are estimated based on rankings of subsets of the items. These rankings are typically pairwise, with ratings taken to be the estimated parameters from fitting a Bradley-Terry model. Likelihood penalization is often employed. Adaptive scheduling of the comparisons can increase the efficiency of the assessment. We show that the most commonly used penalty is not the best-performing penalty under adaptive scheduling and can lead to substantial bias in parameter estimates. We demonstrate this using simulated and real data and provide a theoretical explanation for the relative performance of the penalties considered. Further, we propose a superior approach based on bootstrapping. It is shown to produce better parameter estimates for adaptive schedules and to be robust to variations in underlying strength distributions and…
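As a rough illustration of the setup the abstract describes, the sketch below fits a Bradley-Terry model to pairwise outcomes by maximizing a penalized log-likelihood. The ridge penalty on the log-strengths, the function name `fit_bradley_terry`, the parameter `lam`, and the toy data are all assumptions made for this example; the abstract does not state which penalties the paper compares, and this is not the authors' bootstrap method.

```python
import math


def fit_bradley_terry(n_items, comparisons, lam=0.1, n_iter=5000):
    """Estimate Bradley-Terry log-strengths by gradient ascent.

    comparisons: list of (winner, loser) index pairs.
    lam: weight of a ridge penalty on the log-strengths; must be > 0.
         (Illustrative assumption, not a penalty taken from the paper.)
    """
    # Number of comparisons each item takes part in.
    degree = [0] * n_items
    for w, l in comparisons:
        degree[w] += 1
        degree[l] += 1
    # The objective's curvature is at most 0.5 * max(degree) + lam,
    # so this step size cannot overshoot.
    step = 1.0 / (0.5 * max(degree) + lam)

    theta = [0.0] * n_items
    for _ in range(n_iter):
        # Gradient of the penalty term -(lam / 2) * sum(theta_i ** 2).
        grad = [-lam * t for t in theta]
        for w, l in comparisons:
            # Model probability that w beats l.
            p_win = 1.0 / (1.0 + math.exp(-(theta[w] - theta[l])))
            grad[w] += 1.0 - p_win
            grad[l] -= 1.0 - p_win
        theta = [t + step * g for t, g in zip(theta, grad)]
    return theta


# Toy data: item 2 never wins, so its unpenalized estimate would diverge.
toy_comparisons = [(0, 1), (0, 1), (1, 0), (0, 2), (1, 2)]
toy_theta = fit_bradley_terry(3, toy_comparisons)
```

Without a penalty, an item that loses every comparison has no finite maximum-likelihood estimate, which is one reason penalization is commonly used. The penalty also pulls every estimate toward zero, so its form and weight influence the resulting ratings.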
Taxonomy
Topics: Jury Decision Making Processes · Game Theory and Voting Systems
