Universal Preference-Score-based Pairwise Speech Quality Assessment

Yu-Fei Shi; Yang Ai; Zhen-Hua Ling

arXiv:2506.01455 · cs.SD · June 3, 2025

TL;DR

This paper introduces UPPSQA, a universal model for pairwise speech quality assessment. It predicts the MOS of each speech sample separately, aggregates the two into a preference score, and outperforms baselines across training and test scenarios.

Contribution

The paper presents UPPSQA, a universal preference-score-based model for pairwise speech quality assessment, and addresses the scarcity of preference data by constructing a pairwise speech dataset from an existing MOS dataset.

Findings

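1. UPPSQA outperforms baseline models in accuracy.

2. The model is effective across training scenarios with different data types and label conditions, and in both in-domain and out-of-domain test scenarios.

3. A new pairwise speech dataset was constructed from a MOS dataset for the experiments.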

Abstract

To compare the performance of two speech generation systems, one of the most effective approaches is estimating the preference score between their generated speech. This paper proposes a novel universal preference-score-based pairwise speech quality assessment (UPPSQA) model, aimed at predicting the preference score between paired speech samples to determine which one has better quality. The model first predicts the absolute mean opinion score (MOS) for the two speech samples separately, and then aggregates them into a relative preference score using a preference function. To address the scarcity of preference data, we also construct a new pairwise speech dataset based on a MOS dataset for experiments. Experimental results confirm that, whether in training scenarios with different data types and label conditions, or in both in-domain and out-of-domain test scenarios, the prediction…
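
The two-stage design described in the abstract (separate MOS prediction, then aggregation into a preference score) can be illustrated with a minimal Python sketch. The abstract does not state the form of the preference function or how pairs were derived from the MOS dataset, so both are assumptions here: `preference_score` uses a logistic function of the MOS difference, and `build_pairs` labels each pair by comparing MOS values. Both function names and the `scale` parameter are hypothetical.

```python
import math
from itertools import combinations


def preference_score(mos_a: float, mos_b: float, scale: float = 1.0) -> float:
    """Aggregate two absolute MOS values into a relative preference score.

    ASSUMPTION: a logistic function of the MOS difference; the paper's
    actual preference function may differ. The result lies in (0, 1),
    and a value above 0.5 means sample A is preferred over sample B.
    """
    return 1.0 / (1.0 + math.exp(-scale * (mos_a - mos_b)))


def build_pairs(mos_labels: dict[str, float]) -> list[tuple[str, str, float]]:
    """Derive pairwise examples from a MOS-labeled set of utterances.

    ASSUMPTION: the label is 1.0 if the first utterance has the higher
    MOS, 0.0 if it has the lower MOS, and 0.5 on ties. This is one
    plausible construction, not necessarily the one used in the paper.
    """
    pairs = []
    for (id_a, mos_a), (id_b, mos_b) in combinations(sorted(mos_labels.items()), 2):
        if mos_a > mos_b:
            label = 1.0
        elif mos_a < mos_b:
            label = 0.0
        else:
            label = 0.5
        pairs.append((id_a, id_b, label))
    return pairs
```

In this sketch the MOS values passed to `preference_score` would come from a learned MOS predictor applied to each sample independently, while `build_pairs` shows how an existing MOS dataset could supply pairwise training labels when direct preference annotations are scarce.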

Taxonomy

Topics: Speech and Audio Processing