Rho-Perfect: Correlation Ceiling For Subjective Evaluation Datasets
Fredrik Cumlin

TL;DR
This paper introduces $\rho$-Perfect, a method to estimate the maximum possible correlation between models and human ratings in subjective datasets, accounting for inherent noise and data reliability.
Contribution
It provides a practical estimation technique for the correlation ceiling in subjective datasets, addressing the impact of heteroscedastic noise on model-human correlation.
Findings
$\rho$-Perfect accurately estimates the correlation ceiling in subjective datasets.
The method distinguishes between model limitations and data quality issues.
Application to speech quality data demonstrates its practical utility.
Abstract
Subjective ratings contain inherent noise that limits the model-human correlation, but this reliability issue is rarely quantified. In this paper, we present $\rho$-Perfect, a practical estimation of the highest achievable correlation of a model on subjectively rated datasets. We define $\rho$-Perfect to be the correlation between a perfect predictor and human ratings, and derive an estimate of the value based on heteroscedastic noise scenarios, a common occurrence in subjectively rated datasets. We show that $\rho$-Perfect squared estimates test-retest correlation and use this to validate the estimate. We demonstrate the use of $\rho$-Perfect on a speech quality dataset and show how the measure can distinguish between model limitations and data quality issues.
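The relationship stated in the abstract, that the squared correlation ceiling matches the test-retest correlation, can be illustrated with a small simulation. The sketch below is not the paper's estimator. It assumes a simple additive model in which each item has a true score and its own rating noise level (heteroscedastic noise), with noise independent of the true score. The function name `simulate_ceiling`, the parameter values, and the noise distribution are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_ceiling(n_items=20000, n_raters=8, seed=0):
    """Simulate a rated dataset under an assumed additive-noise model.

    Returns (rho_perfect, test_retest, analytic), where:
      rho_perfect: correlation of the true scores with one set of mean ratings
      test_retest: correlation between two independent sets of mean ratings
      analytic:    sqrt(var_true / (var_true + mean_noise_var / n_raters))
    """
    rng = np.random.default_rng(seed)

    # Hypothetical "true" quality of each item (what a perfect predictor outputs).
    true_score = rng.normal(3.0, 1.0, n_items)

    # Heteroscedastic noise: every item gets its own rating standard deviation.
    noise_std = rng.uniform(0.3, 1.5, n_items)

    def mean_rating():
        # Each rater observes the true score plus item-specific noise.
        noise = noise_std[:, None] * rng.standard_normal((n_items, n_raters))
        return (true_score[:, None] + noise).mean(axis=1)

    # Two independent rating sessions over the same items.
    mos_a = mean_rating()
    mos_b = mean_rating()

    rho_perfect = np.corrcoef(true_score, mos_a)[0, 1]
    test_retest = np.corrcoef(mos_a, mos_b)[0, 1]

    # Closed form for this assumed model: the averaged rating noise inflates
    # the variance of the mean ratings but not their covariance with the truth.
    var_true = np.var(true_score)
    noise_var = np.mean(noise_std ** 2) / n_raters
    analytic = np.sqrt(var_true / (var_true + noise_var))

    return rho_perfect, test_retest, analytic


if __name__ == "__main__":
    rho, retest, analytic = simulate_ceiling()
    print(f"perfect-predictor correlation: {rho:.3f}")
    print(f"squared:                       {rho ** 2:.3f}")
    print(f"test-retest correlation:       {retest:.3f}")
    print(f"closed form (assumed model):   {analytic:.3f}")
```

Under these assumptions, even a predictor that returns the true score exactly correlates below 1 with the mean ratings, and the square of that correlation should land close to the test-retest value. Raising `n_raters` or lowering the noise range moves the ceiling toward 1.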
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Face recognition and analysis
