Generative Speech Coding with Predictive Variance Regularization

W. Bastiaan Kleijn; Andrew Storus; Michael Chinen; Tom Denton; Felicia S. C. Lim; Alejandro Luebs; Jan Skoglund; Hengchin Yeh

arXiv:2102.09660 · eess.AS · February 22, 2021

TL;DR

This paper introduces predictive-variance regularization, which reduces the sensitivity of generative speech models to outliers, and shows that noise reduction to remove unwanted signals further improves performance at low bit rates.

Contribution

The paper proposes predictive-variance regularization for generative speech models to address the outlier sensitivity of the maximum likelihood criterion, and reports state-of-the-art coding performance at 3 kb/s in subjective evaluations.

Findings

01 Predictive-variance regularization significantly improves performance

02 Noise reduction that removes unwanted signals significantly increases performance

03 Subjective evaluations show state-of-the-art coding performance at 3 kb/s

Abstract

The recent emergence of machine-learning based generative models for speech suggests a significant reduction in bit rate for speech codecs is possible. However, the performance of generative models deteriorates significantly with the distortions present in real-world input signals. We argue that this deterioration is due to the sensitivity of the maximum likelihood criterion to outliers and the ineffectiveness of modeling a sum of independent signals with a single autoregressive model. We introduce predictive-variance regularization to reduce the sensitivity to outliers, resulting in a significant increase in performance. We show that noise reduction to remove unwanted signals can significantly increase performance. We provide extensive subjective performance evaluations that show that our system based on generative modeling provides state-of-the-art coding performance at 3 kb/s for…
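The abstract's claim about outlier sensitivity can be illustrated with a toy sketch. This is not the paper's formulation: it assumes a single Gaussian predictive distribution with a fixed mean of zero, and the penalty on the predicted log-variance (`weight`) is a hypothetical stand-in for a variance regularizer. All function names here are illustrative.

```python
import numpy as np

def gaussian_nll(x, mean, log_var):
    """Mean negative log-likelihood of x under N(mean, exp(log_var))."""
    var = np.exp(log_var)
    return float(np.mean(0.5 * (np.log(2 * np.pi) + log_var + (x - mean) ** 2 / var)))

def regularized_loss(x, mean, log_var, weight):
    """NLL plus a hypothetical penalty on the predicted log-variance (assumption)."""
    return gaussian_nll(x, mean, log_var) + weight * float(np.mean(log_var))

def best_log_var(x, mean, weight, grid):
    """Grid search for the constant log-variance that minimizes the loss."""
    losses = [regularized_loss(x, mean, np.full_like(x, lv), weight) for lv in grid]
    return float(grid[int(np.argmin(losses))])

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=1000)   # well-behaved samples
noisy = clean.copy()
noisy[:10] = 30.0                         # 1% of samples replaced by outliers

grid = np.linspace(-2.0, 4.0, 601)
lv_clean = best_log_var(clean, 0.0, 0.0, grid)  # plain ML, clean data
lv_ml = best_log_var(noisy, 0.0, 0.0, grid)     # plain ML, with outliers
lv_reg = best_log_var(noisy, 0.0, 0.5, grid)    # penalized, with outliers

print(f"log-variance  clean ML: {lv_clean:.2f}  outlier ML: {lv_ml:.2f}  outlier regularized: {lv_reg:.2f}")
```

In this toy setting a handful of outliers inflates the maximum likelihood variance estimate, and the penalty pulls it back down. The real system uses an autoregressive model with a learned predictive distribution, so the sketch only conveys the intuition.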

Code & Models

Repositories

google/lyra