WARP-Q: Quality Prediction For Generative Neural Speech Codecs

Wissam A. Jassim; Jan Skoglund; Michael Chinen; Andrew Hines

arXiv:2102.10449·eess.AS·February 23, 2021·ICASSP

TL;DR

WARP-Q is a new objective speech quality metric that accurately predicts the quality of generative neural speech codecs, outperforming traditional models in correlation and ranking across various codecs and noise conditions.

Contribution

The paper introduces WARP-Q, a novel full-reference speech quality metric tailored for generative neural speech codecs, addressing limitations of existing models.

Findings

1. WARP-Q correlates more strongly with subjective quality assessments than traditional models.
2. It ranks codecs effectively and is robust to small perceptual signal changes.
3. It outperforms existing metrics such as POLQA and ViSQOL.

Abstract

Good speech quality has been achieved using waveform matching and parametric reconstruction coders. Recently developed very low bit rate generative codecs can reconstruct high quality wideband speech with bit streams less than 3 kb/s. These codecs use a DNN with parametric input to synthesise high quality speech outputs. Existing objective speech quality models (e.g., POLQA, ViSQOL) do not accurately predict the quality of coded speech from these generative models, underestimating quality due to signal differences not highlighted in subjective listening tests. We present WARP-Q, a full-reference objective speech quality metric that uses dynamic time warping cost for MFCC speech representations. It is robust to small perceptual signal changes. Evaluation using waveform matching, parametric and generative neural vocoder based codecs as well as channel and environmental noise shows that…
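
The core idea in the abstract, a dynamic time warping (DTW) cost computed between feature sequences of the reference and the coded signal, can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain textbook DTW with Euclidean frame distances, the function name `dtw_cost` is hypothetical, and random matrices stand in for real MFCC frames (feature extraction is omitted entirely).

```python
import numpy as np

def dtw_cost(ref, deg):
    """Accumulated DTW alignment cost between two feature sequences.

    ref, deg: arrays of shape (frames, coeffs), e.g. MFCC frames.
    Illustrative only; not the WARP-Q reference code.
    """
    n, m = len(ref), len(deg)
    # Euclidean distance between every reference frame and every degraded frame.
    dist = np.linalg.norm(ref[:, None, :] - deg[None, :, :], axis=-1)
    # acc[i, j] is the cheapest cost of aligning ref[:i] with deg[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = dist[i - 1, j - 1] + best_prev
    return acc[n, m]

# Assumption: random 13-dimensional frames stand in for MFCCs of real speech.
rng = np.random.default_rng(0)
ref = rng.standard_normal((40, 13))
stretched = np.repeat(ref, 2, axis=0)                # same content, each frame doubled
noisy = ref + 0.5 * rng.standard_normal(ref.shape)   # same timing, altered content

cost_same = dtw_cost(ref, ref)
cost_stretched = dtw_cost(ref, stretched)
cost_noisy = dtw_cost(ref, noisy)
print(cost_same, cost_stretched, cost_noisy)
```

In this toy setup the time-stretched copy aligns at zero cost, just like the identical copy, while the noisy copy yields a positive cost. That is the property that makes a warping-based cost tolerant of small timing differences that a frame-by-frame comparison would penalise.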
