Your 2 is My 1, Your 3 is My 9: Handling Arbitrary Miscalibrations in Ratings
Jingyan Wang, Nihar B. Shah

TL;DR
This paper introduces estimators that exploit cardinal scores under arbitrary, even adversarially chosen, miscalibrations and outperform ranking-only methods, without assuming a specific miscalibration model.
Contribution
The paper challenges the widespread belief that, absent assumptions on the miscalibration, only the induced ranking is useful, by developing estimators that draw on the cardinal scores themselves however complex the miscalibration.
Findings
Estimators outperform ranking-only methods under arbitrary miscalibrations.
Proposed methods are flexible for various applications like A/B testing and ranking.
The approach provides new insights into the use of cardinal versus ordinal data.
Abstract
Cardinal scores (numeric ratings) collected from people are well known to suffer from miscalibrations. A popular approach to address this issue is to assume simplistic models of miscalibration (such as linear biases) to de-bias the scores. This approach, however, often fares poorly because people's miscalibrations are typically far more complex and not well understood. In the absence of simplifying assumptions on the miscalibration, it is widely believed by the crowdsourcing community that the only useful information in the cardinal scores is the induced ranking. In this paper, inspired by the framework of Stein's shrinkage, empirical Bayes, and the classic two-envelope problem, we contest this widespread belief. Specifically, we consider cardinal scores with arbitrary (or even adversarially chosen) miscalibrations which are only required to be consistent with the induced ranking. We…
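The abstract cites the classic two-envelope problem as one source of inspiration. The sketch below illustrates only that classic problem, not the paper's estimators: one of two distinct values is observed at random and kept with a probability that increases with the observed value, which yields the larger value with probability strictly above 1/2. The function names and the logistic choice of `w` are assumptions made for illustration.

```python
import math
import random

def w(x):
    """Strictly increasing map from the reals into (0, 1).
    The logistic function is an assumed choice; any such map works."""
    return 1.0 / (1.0 + math.exp(-x))

def win_probability(a, b):
    """Exact chance of ending with the larger of two distinct values
    when the observed value x is kept with probability w(x)."""
    lo, hi = min(a, b), max(a, b)
    return 0.5 + 0.5 * (w(hi) - w(lo))

def play_once(a, b, rng):
    """Open one envelope uniformly at random, then keep or switch."""
    observed, hidden = (a, b) if rng.random() < 0.5 else (b, a)
    final = observed if rng.random() < w(observed) else hidden
    return final == max(a, b)

def simulate(a, b, trials=100_000, seed=0):
    """Monte Carlo estimate of the chance of ending with the larger value."""
    rng = random.Random(seed)
    return sum(play_once(a, b, rng) for _ in range(trials)) / trials
```

For example, `win_probability(1.0, 2.0)` exceeds 0.5 because `w(2.0) > w(1.0)`, and `simulate(1.0, 2.0)` should land close to the same value. The point of the illustration is that the magnitude of an observed number can carry usable information even when nothing is assumed about how the numbers were generated.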
Taxonomy
Topics: Stock Market Forecasting Methods · Forecasting Techniques and Applications · Organizational Management and Leadership
