Optimal Pairwise Comparison Procedures for Subjective Evaluation
Jack Webb, Lorenzo Picinali

TL;DR
This paper evaluates pairwise comparison methods for subjective audio quality assessment and proposes a new sampling procedure that converges faster while maintaining accuracy, reducing the number of comparisons needed.
Contribution
A novel sampling procedure for pairwise comparisons is introduced, outperforming existing methods in convergence speed while maintaining accuracy.
Findings
Bayesian sampling yields the most robust score estimates.
The proposed procedure converges fastest on the true ranking.
The proposed procedure requires fewer comparisons to reach accurate results.
Abstract
Audio signal processing algorithms are frequently assessed through subjective listening tests in which participants directly score degraded signals on a unidimensional numerical scale. However, this approach is susceptible to inconsistencies in scale calibration between assessors. Pairwise comparisons between degraded signals offer a more intuitive alternative, eliciting the relative scores of candidate signals with lower measurement error and reduced participant fatigue. Yet, due to the quadratic growth of the number of necessary comparisons, a complete set of pairwise comparisons becomes unfeasible for large datasets. This paper compares pairwise comparison procedures to identify the most efficient methods for approximating true quality scores with minimal comparisons. A novel sampling procedure is proposed and benchmarked against state-of-the-art methods on simulated datasets.…
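To make the quadratic growth mentioned in the abstract concrete, the short sketch below counts the comparisons in a complete pairwise design, which is n(n-1)/2 for n signals. The function names `num_pairs` and `all_pairs` are illustrative only and do not come from the paper.

```python
from itertools import combinations


def num_pairs(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def all_pairs(items):
    """Enumerate every unordered pair for a complete comparison design."""
    return list(combinations(items, 2))


if __name__ == "__main__":
    # Show how quickly the full design grows with the number of signals.
    for n in (10, 50, 100, 500):
        print(f"{n} signals -> {num_pairs(n)} comparisons")
```

Going from 10 to 100 signals multiplies the full design from 45 to 4950 comparisons, which is why sampling only a subset of pairs becomes attractive for large datasets.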
