Dynamical Variational Autoencoders: A Comprehensive Review

Laurent Girin; Simon Leglaive; Xiaoyu Bie; Julien Diard; Thomas Hueber; Xavier Alameda-Pineda

arXiv:2008.12595·cs.LG·July 5, 2022

TL;DR

This paper reviews and unifies dynamical variational autoencoder (DVAE) models, which extend VAEs to sequential data. It provides a comprehensive overview, reimplementations of the models, and benchmark results on a speech analysis-resynthesis task.

Contribution

The paper introduces a general class of dynamical VAEs, standardizes their notation, and benchmarks seven recent models on a speech analysis-resynthesis task.

Findings

01 Reimplemented seven DVAE models for comparison.

02 Reported benchmark results on speech analysis-resynthesis.

03 Discussed future directions for DVAE research.

Abstract

Variational autoencoders (VAEs) are powerful deep generative models widely used to represent high-dimensional complex data through a low-dimensional latent space learned in an unsupervised manner. In the original VAE model, the input data vectors are processed independently. Recently, a series of papers have presented different extensions of the VAE to process sequential data, which model not only the latent space but also the temporal dependencies within a sequence of data vectors and corresponding latent vectors, relying on recurrent neural networks or state-space models. In this paper, we perform a literature review of these models. We introduce and discuss a general class of models, called dynamical variational autoencoders (DVAEs), which encompasses a large subset of these temporal VAE extensions. Then, we present in detail seven recently proposed DVAE models, with an aim to…

Code & Models

Repositories

XiaoyuBIE1994/DVAE
PyTorch · Official
