Performance of Objective Speech Quality Metrics on Languages Beyond Validation Data: A Study of Turkish and Korean
Javier Perez, Dimme de Groot, Jorge Martinez

TL;DR
This study evaluates how well two speech quality metrics, PESQ and ViSQOL, perform on Turkish and Korean, revealing language-dependent biases and emphasizing the need for multilingual datasets.
Contribution
It highlights the limitations of existing speech quality metrics on unseen languages and provides empirical evidence of their biased performance on Turkish and Korean.
Findings
Turkish samples have significantly higher ViSQOL scores than the English baseline.
Correlation between PESQ and ViSQOL is highest for Turkish male speakers.
Performance varies significantly across languages and speaker genders.
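As a rough illustration of the per-group comparison behind these findings, the sketch below computes a correlation between PESQ and ViSQOL scores separately for each (language, gender) group. It assumes Pearson correlation, which this summary does not confirm as the coefficient used in the paper. The function names `pearson` and `correlation_by_group`, the record layout, and the `toy_records` values are all hypothetical placeholders, not data or code from the study.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlation_by_group(records):
    """Map (language, gender) to the Pearson r between PESQ and ViSQOL scores.

    Each record is a tuple: (language, gender, pesq_score, visqol_score).
    This layout is an assumption made for the sketch.
    """
    groups = defaultdict(lambda: ([], []))
    for language, gender, pesq_score, visqol_score in records:
        pesq_list, visqol_list = groups[(language, gender)]
        pesq_list.append(pesq_score)
        visqol_list.append(visqol_score)
    return {key: pearson(p, v) for key, (p, v) in groups.items()}


# Placeholder values for illustration only; these are NOT results from the paper.
toy_records = [
    ("turkish", "male", 1.0, 1.5),
    ("turkish", "male", 2.0, 2.5),
    ("turkish", "male", 3.0, 3.5),
    ("korean", "female", 1.0, 3.0),
    ("korean", "female", 2.0, 2.0),
    ("korean", "female", 3.0, 1.0),
]

if __name__ == "__main__":
    for group, r in sorted(correlation_by_group(toy_records).items()):
        print(group, round(r, 3))
```

In practice the scores would come from running PESQ and ViSQOL on the same degraded and reference recordings, and a rank-based coefficient could be substituted for `pearson` if the relationship between the two metrics is not expected to be linear.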
Abstract
Objective speech quality measures are widely used to assess the performance of video conferencing platforms and telecommunication systems. They predict human-rated speech quality and are crucial for assessing the systems' quality of experience. Despite their widespread use, these quality measures are developed on a limited set of languages. This can be problematic, since their performance on unseen languages is consequently not guaranteed or even studied. Here we raise awareness of this issue by investigating the performance of two objective speech quality measures (PESQ and ViSQOL) on Turkish and Korean. Using English as a baseline, we show that Turkish samples have significantly higher ViSQOL scores and that the correlation between PESQ and ViSQOL is highest for Turkish male speakers. These results highlight the need to explore biases across metrics and to develop a labeled speech quality…
Taxonomy
Topics: Image and Video Quality Assessment · Speech and Audio Processing · Advanced Data Compression Techniques
Methods: Sparse Evolutionary Training
