TL;DR
This paper introduces a mathematically based calibration algorithm for panel assessments in which not every assessor scores every object. It corrects for differing standards among assessors and weights each score by its declared confidence, improving the accuracy of the resulting object evaluations.
Contribution
A novel graph-based calibration algorithm that incorporates assessor confidence levels to infer true object values and standardize scores.
Findings
Algorithm effectively calibrates assessor scores.
Outperforms simple averaging and Fisher's method.
Validated through simulations and real data.
Abstract
Frequently, a set of objects has to be evaluated by a panel of assessors, but not every object is assessed by every assessor. A problem facing such panels is how to take into account different standards amongst panel members and varying levels of confidence in their scores. Here, a mathematically-based algorithm is developed to calibrate the scores of such assessors, addressing both of these issues. The algorithm is based on the connectivity of the graph of assessors and objects evaluated, incorporating declared confidences as weights on its edges. If the graph is sufficiently well connected, relative standards can be inferred by comparing how assessors rate objects they assess in common, weighted by the levels of confidence of each assessment. By removing these biases, "true" values are inferred for all the objects. Reliability estimates for the resulting values are obtained. The…
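The idea described in the abstract can be illustrated with a small sketch. It assumes an additive model that the summary above does not spell out: each score equals the object's true value plus the assessor's bias, and each declared confidence acts as a least-squares weight. The function name `calibrate`, the tuple layout, and the convention that biases sum to zero are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def calibrate(assessments, n_assessors, n_objects):
    """Confidence-weighted least-squares calibration (illustrative sketch).

    assessments: list of (assessor, obj, score, confidence) tuples.
    Assumed model: score = value[obj] + bias[assessor] + noise,
    with confidence used as the weight on each squared residual.
    """
    n_unknowns = n_objects + n_assessors
    rows, rhs = [], []
    for a, o, s, c in assessments:
        w = np.sqrt(c)                  # sqrt so the squared residual is weighted by c
        row = np.zeros(n_unknowns)
        row[o] = w                      # coefficient of the object's value
        row[n_objects + a] = w          # coefficient of the assessor's bias
        rows.append(row)
        rhs.append(w * s)
    # Values and biases are only determined up to a common shift,
    # so fix that freedom by requiring the biases to sum to zero (assumption).
    gauge = np.zeros(n_unknowns)
    gauge[n_objects:] = 1.0
    rows.append(gauge)
    rhs.append(0.0)
    sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return sol[:n_objects], sol[n_objects:]

# Toy data: assessor 0 is generous (+2), assessor 1 is harsh (-2).
# They overlap only on object 1, which is what connects the graph.
assessments = [
    (0, 0, 12.0, 1.0),
    (0, 1, 22.0, 1.0),
    (1, 1, 18.0, 1.0),
    (1, 2, 28.0, 1.0),
]
values, biases = calibrate(assessments, n_assessors=2, n_objects=3)
print(np.round(values, 6), np.round(biases, 6))
```

In this noise-free toy case, simple averaging would score the three objects 12, 20, and 28, whereas the calibrated values recover the underlying 10, 20, and 30 because the shared object lets the two assessors' standards be compared. If the assessor-object graph were disconnected, the relative biases between components could not be inferred.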
