Learning robust speech representation with an articulatory-regularized variational autoencoder

Marc-Antoine Georges; Laurent Girin; Jean-Luc Schwartz; Thomas Hueber

arXiv:2104.03204·cs.SD·April 8, 2021

TL;DR

This paper introduces an articulatory-regularized variational autoencoder that leverages articulatory parameters to improve speech representation learning, resulting in faster training, lower reconstruction loss, and enhanced speech denoising performance.

Contribution

The paper develops an articulatory model and incorporates its parameters into a VAE as a constraint on part of the latent space, reporting faster training convergence and better speech denoising than a VAE trained without the articulatory constraint.

Findings

01

Reduced training time and convergence loss

02

Enhanced speech denoising performance

03

Effective incorporation of articulatory features

Abstract

It is increasingly considered that human speech perception and production both rely on articulatory representations. In this paper, we investigate whether this type of representation could improve the performance of a deep generative model (here a variational autoencoder) trained to encode and decode acoustic speech features. First, we develop an articulatory model able to associate articulatory parameters describing the jaw, tongue, lips and velum configurations with vocal tract shapes and spectral features. Then we incorporate these articulatory parameters into a variational autoencoder applied to spectral features by using a regularization technique that constrains part of the latent space to follow articulatory trajectories. We show that this articulatory constraint improves model training by decreasing time to convergence and reconstruction loss at convergence, and yields better…
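
The regularization idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything below is an assumption made for illustration: a squared-error reconstruction term, the usual Gaussian KL term, and a squared-error penalty pulling the first `n_artic` latent means toward the articulatory parameters. The function name `articulatory_vae_loss` and the parameters `n_artic` and `weight` are hypothetical and not taken from the paper.

```python
import numpy as np

def articulatory_vae_loss(x, x_hat, mu, logvar, artic, n_artic, weight=1.0):
    """Illustrative VAE loss with an articulatory penalty (assumed form).

    x, x_hat : (batch, n_features) spectral features and their reconstruction
    mu, logvar : (batch, n_latent) mean and log-variance of the approximate posterior
    artic : (batch, n_artic) articulatory parameters aligned with each frame
    n_artic : number of leading latent dimensions tied to articulation
    weight : strength of the articulatory penalty (hypothetical hyperparameter)
    """
    # Squared-error reconstruction term, averaged over the batch.
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    # KL divergence between N(mu, exp(logvar)) and a standard normal prior.
    kl = np.mean(-0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1))
    # Assumed articulatory constraint: the first n_artic latent means
    # should follow the articulatory trajectories.
    artic_penalty = np.mean(np.sum((mu[:, :n_artic] - artic) ** 2, axis=1))
    total = recon + kl + weight * artic_penalty
    return total, recon, kl, artic_penalty

# Usage with random placeholder data (not real speech or articulatory data).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 20))
x_hat = x + 0.1 * rng.normal(size=(8, 20))
mu = rng.normal(size=(8, 16))
logvar = np.zeros((8, 16))
artic = rng.normal(size=(8, 6))
total, recon, kl, penalty = articulatory_vae_loss(x, x_hat, mu, logvar, artic, n_artic=6)
print(total, recon, kl, penalty)
```

Only part of the latent vector is constrained in this sketch, which matches the abstract's description; the remaining dimensions are shaped by the reconstruction and KL terms alone.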
