NISQA: A Deep CNN-Self-Attention Model for Multidimensional Speech Quality Prediction with Crowdsourced Datasets

Gabriel Mittag; Babak Naderi; Assmaa Chehadi; Sebastian Möller

arXiv:2104.09494·eess.AS·December 15, 2021

TL;DR

This paper presents an updated NISQA, a deep CNN-self-attention model that predicts overall speech quality together with four quality dimensions (Noisiness, Coloration, Discontinuity, and Loudness). New datasets with over 13,000 speech files were created for training and validation, and the model was tested on recordings of real telephone calls.

Contribution

The paper presents an updated NISQA model that is trained end-to-end and uses self-attention for time-dependency modelling and time pooling. It is trained and validated on new datasets with over 13,000 speech files and tested on a live-talking dataset of real telephone calls.

Findings

01

NISQA predicts overall speech quality and four quality dimensions: Noisiness, Coloration, Discontinuity, and Loudness.

02

The model generalizes well to unseen speech samples from diverse datasets.

03

Open-sourced code and datasets facilitate further research.

Abstract

In this paper, we present an update to the NISQA speech quality prediction model that is focused on distortions that occur in communication networks. In contrast to the previous version, the model is trained end-to-end, and the time-dependency modelling and time-pooling are achieved through a Self-Attention mechanism. Besides overall speech quality, the model also predicts the four speech quality dimensions Noisiness, Coloration, Discontinuity, and Loudness, and in this way gives more insight into the cause of a quality degradation. Furthermore, new datasets with over 13,000 speech files were created for training and validation of the model. The model was finally tested on a new, live-talking test dataset that contains recordings of real telephone calls. Overall, NISQA was trained and evaluated on 81 datasets from different sources and was shown to provide reliable predictions also for…
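
The abstract mentions that time-pooling is handled by an attention mechanism. As a rough illustration of that general idea only, the sketch below collapses a sequence of per-frame feature vectors into one fixed-size vector using softmax weights. It is not the NISQA architecture: the function names, the single scoring vector `score_weights`, and the toy frame values are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, score_weights):
    """Collapse T frame vectors of length D into one vector of length D.

    frames: list of T feature vectors (hypothetical per-frame features).
    score_weights: hypothetical learned vector of length D that scores
    each frame; in a real model this would be trained, here it is fixed.
    """
    # One scalar relevance score per frame (dot product with score_weights).
    scores = [sum(w * x for w, x in zip(score_weights, frame)) for frame in frames]
    # Normalise the scores into weights that sum to 1 across time.
    weights = softmax(scores)
    # Weighted average of the frames along the time axis.
    dim = len(frames[0])
    pooled = [
        sum(weights[t] * frames[t][d] for t in range(len(frames)))
        for d in range(dim)
    ]
    return pooled, weights


# Toy input: three frames with two features each (made-up values).
frames = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
score_weights = [1.0, 0.0]  # assumption: only the first feature drives the score
pooled, weights = attention_pool(frames, score_weights)
```

Because the weights sum to 1, the pooled vector stays on the same scale as the frame features regardless of the recording length, which is what lets a single output layer work on variable-length speech files. Frames with higher scores contribute more, unlike plain averaging over time.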

Code & Models

Repositories

gabrielmittag/NISQA
PyTorch·Official
