Speaker independence of neural vocoders and their effect on parametric resynthesis speech enhancement
Soumi Maiti, Michael I Mandel

TL;DR
This paper demonstrates that neural vocoders, when trained on diverse speakers, can generate high-quality speech for unseen speakers, significantly improving parametric resynthesis speech enhancement over existing methods.
Contribution
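It shows that neural vocoders traditionally treated as speaker dependent (WaveNet and WaveGlow) become speaker independent when trained on data from enough speakers, and that parametric resynthesis (PR) built on them enhances speech from unseen speakers better than existing systems.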
Findings
Neural vocoders trained on multiple speakers can generate speech for unseen speakers with high quality.
Parametric resynthesis (PR) with these vocoders achieves higher objective signal and overall quality than state-of-the-art speech enhancement systems such as Wave-U-Net.
Multi-speaker PR surpasses the oracle Wiener mask in subjective quality.
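For context, an oracle Wiener mask is an upper-bound baseline for mask-based enhancement: it is computed from the true clean speech and noise, which are unavailable in practice. The sketch below assumes the standard power-ratio form; the paper's exact definition is not given in this summary, and the function names and toy values are hypothetical.

```python
def wiener_gain(clean_power, noise_power, eps=1e-12):
    """Per-bin gain in [0, 1]: the share of the bin's power that is speech.

    eps avoids division by zero when a bin holds no energy at all.
    """
    return clean_power / (clean_power + noise_power + eps)


def oracle_wiener_mask(clean_power_spec, noise_power_spec):
    """Compute the gain for every time-frequency bin.

    Both inputs are equal-shaped power spectrograms, given here as
    lists of frames, each frame a list of per-frequency powers.
    """
    return [
        [wiener_gain(s, n) for s, n in zip(clean_frame, noise_frame)]
        for clean_frame, noise_frame in zip(clean_power_spec, noise_power_spec)
    ]


# Toy 2-frame, 2-bin example (illustrative values only).
clean = [[4.0, 0.0], [1.0, 9.0]]
noise = [[4.0, 1.0], [0.0, 1.0]]
mask = oracle_wiener_mask(clean, noise)
```

Multiplying the noisy spectrogram by such a mask attenuates noise-dominated bins. A mask can only rescale what is already in the noisy mixture, whereas PR resynthesizes the waveform with a vocoder.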
Abstract
Traditional speech enhancement systems produce speech with compromised quality. Here we propose to use the high quality speech generation capability of neural vocoders for better quality speech enhancement. We term this parametric resynthesis (PR). In previous work, we showed that PR systems generate high quality speech for a single speaker using two neural vocoders, WaveNet and WaveGlow. Both these vocoders are traditionally speaker dependent. Here we first show that when trained on data from enough speakers, these vocoders can generate speech from unseen speakers, both male and female, with similar quality as seen speakers in training. Next using these two vocoders and a new vocoder LPCNet, we evaluate the noise reduction quality of PR on unseen speakers and show that objective signal and overall quality is higher than the state-of-the-art speech enhancement systems Wave-U-Net,…
Taxonomy
Methods: Mixture of Logistic Distributions · Affine Coupling · Normalizing Flows · Invertible 1x1 Convolution · WaveGlow · Dilated Causal Convolution · WaveNet
