Regularized Sequential Latent Variable Models with Adversarial Neural Networks
Jin Huang, Ming Xiao

TL;DR
This paper introduces a novel adversarial training approach for variational RNNs with latent variables, improving stability and posterior approximation for modeling complex sequential data like speech.
Contribution
It proposes a new adversarial training method for variational RNNs that enhances stability and theoretical optimality over existing approaches.
Findings
Convergence of reconstruction loss and evidence lower bound.
Improved posterior approximation through adversarial training.
Stable training process demonstrated on TIMIT speech data.
Abstract
Recurrent neural networks (RNNs), with richly distributed internal states and flexible non-linear transition functions, have overtaken dynamic Bayesian networks such as hidden Markov models (HMMs) in the task of modeling highly structured sequential data. Such data, for example speech and handwriting, often contain complex relationships between the underlying variational factors and the observed data. The standard RNN model has very limited randomness or variability in its structure, which comes from the output conditional probability model. This paper presents different ways of using high-level latent random variables in an RNN to model the variability in sequential data, and a training method for such an RNN model under the variational autoencoder (VAE) principle. We explore possible ways of using adversarial methods to train a variational RNN model. Contrary to…
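As a rough illustration of the VAE training principle the abstract refers to, the sketch below computes the standard sequential evidence lower bound (ELBO) as a sum of per-step reconstruction terms minus KL terms. It is not the paper's adversarial method. It assumes diagonal Gaussian prior, posterior, and likelihood, and a single-sample estimate of the reconstruction term; the function names and dictionary keys are hypothetical and do not come from the paper.

```python
import math


def gaussian_kl(mu_q, var_q, mu_p, var_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    return sum(
        0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
        for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p)
    )


def gaussian_log_likelihood(x, mean, var):
    """log N(x; mean, diag(var)), summed over dimensions."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def sequence_elbo(steps):
    """Sum (reconstruction - KL) over the time steps of one sequence.

    Each step is a dict with hypothetical keys:
      'x'                        observed frame
      'dec_mean', 'dec_var'      decoder output given one sampled latent
      'post_mean', 'post_var'    approximate posterior over the latent
      'prior_mean', 'prior_var'  prior over the latent at this step
    """
    total = 0.0
    for s in steps:
        recon = gaussian_log_likelihood(s["x"], s["dec_mean"], s["dec_var"])
        kl = gaussian_kl(
            s["post_mean"], s["post_var"], s["prior_mean"], s["prior_var"]
        )
        total += recon - kl
    return total
```

In a real model the means and variances at each step would be produced by networks conditioned on the RNN hidden state; here they are passed in directly so the arithmetic of the bound stays visible.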
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Speech Recognition and Synthesis
Methods: Variational Inference
