Study on the Correlation between Objective Evaluations and Subjective Speech Quality and Intelligibility
Hsin-Tien Chiang, Kuo-Hsuan Hung, Szu-Wei Fu, Heng-Cheng Kuo, Ming-Hsueh Tsai, Yu Tsao

TL;DR
This paper investigates how well existing objective speech quality and intelligibility measures align with subjective human assessments, proposing deep learning-based combined measures that improve prediction accuracy and reduce training data needs.
Contribution
It introduces a novel deep learning approach that combines current objective measures to better predict subjective speech quality and intelligibility, with insights into their relationship.
Findings
Deep learning-based combined measures improve prediction accuracy.
Including subjective ratings enhances intelligibility prediction.
Proposed models reduce training data requirements.
Abstract
Subjective tests are the gold standard for evaluating speech quality and intelligibility; however, they are time-consuming and expensive. Thus, objective measures that align with human perceptions are crucial. This study evaluates the correlation between commonly used objective measures and subjective speech quality and intelligibility using a Chinese speech dataset. Moreover, new objective measures are proposed that combine current objective measures using deep learning techniques to predict subjective quality and intelligibility. The proposed deep learning model reduces the amount of training data without significantly affecting prediction performance. We analyzed the deep learning model to understand how objective measures reflect subjective quality and intelligibility. We also explored the impact of including subjective speech quality ratings on speech intelligibility prediction.…
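As a rough illustration of the kind of correlation analysis the abstract describes, the sketch below computes a linear (Pearson) and a rank (Spearman) correlation between one objective measure and the corresponding subjective ratings. The data values, variable names, and the choice of these two coefficients are assumptions made for illustration; the abstract does not state which correlation statistics or which objective measures the authors used.

```python
import numpy as np


def pearson(x, y):
    """Linear correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def spearman(x, y):
    """Rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-utterance scores (not taken from the paper).
objective_scores = [1.2, 2.0, 2.9, 3.1, 4.0]    # some objective measure
subjective_ratings = [1.5, 2.5, 2.0, 3.5, 4.5]  # mean listener ratings

lcc = pearson(objective_scores, subjective_ratings)
srcc = spearman(objective_scores, subjective_ratings)
print(f"Pearson: {lcc:.3f}  Spearman: {srcc:.3f}")
```

A higher coefficient would suggest that the objective measure tracks listener judgments more closely. A rank correlation can be useful alongside the linear one when the relationship is monotonic but not linear.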
Taxonomy
Topics: Speech and Audio Processing · Ultrasonics and Acoustic Wave Propagation · Hearing Loss and Rehabilitation
Methods: ALIGN
