Scalable Factorized Hierarchical Variational Autoencoder Training

Wei-Ning Hsu; James Glass

arXiv:1804.03201 · stat.ML · June 18, 2018

TL;DR

This paper introduces a scalable hierarchical sampling training algorithm for factorized hierarchical variational autoencoders, enabling effective training on large-scale datasets for improved disentangled representation learning.

Contribution

The paper proposes a hierarchical sampling training algorithm for FHVAE models that addresses the runtime, memory, and hyperparameter optimization limitations of the original training procedure, making training scalable to much larger datasets.
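
This summary does not spell out the algorithm itself. As a rough illustration only, one plausible reading of "hierarchical sampling" is a two-level scheme: first draw a small block of sequences, then draw segment minibatches only from within that block, so that any per-sequence state needs to be held for the block rather than for the whole dataset. The sketch below follows that assumption; the function name, arguments, and data layout are hypothetical and are not taken from the authors' code.

```python
import random
from typing import Dict, Iterator, List, Tuple


def hierarchical_batches(
    segments_by_sequence: Dict[str, List[int]],
    sequences_per_block: int,
    batch_size: int,
    rng: random.Random,
) -> Iterator[Tuple[List[str], List[Tuple[int, int]]]]:
    """Yield (block, batch) pairs using two-level sampling (illustrative only).

    block: the sequence ids currently "in memory".
    batch: (index of the sequence within the block, segment id) pairs.
    """
    # Outer level: visit the sequences in random order, one block at a time.
    seq_ids = list(segments_by_sequence)
    rng.shuffle(seq_ids)
    for start in range(0, len(seq_ids), sequences_per_block):
        block = seq_ids[start:start + sequences_per_block]

        # Inner level: segment minibatches are drawn only from this block,
        # so sequence indices stay in the range [0, len(block)).
        pairs = [
            (local_idx, seg)
            for local_idx, seq in enumerate(block)
            for seg in segments_by_sequence[seq]
        ]
        rng.shuffle(pairs)
        for b in range(0, len(pairs), batch_size):
            yield block, pairs[b:b + batch_size]
```

Under this assumption, a training loop would keep sequence-level parameters only for the `sequences_per_block` sequences in the current block, which bounds that part of the memory cost regardless of dataset size.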

Findings

1. Effective training on datasets from 3 to 1,000 hours.

2. Models demonstrate improved disentanglement and interpretability.

3. Visualization method aids qualitative evaluation.

Abstract

Deep generative models have achieved great success in unsupervised learning with the ability to capture complex nonlinear relationships between latent generating factors and observations. Among them, a factorized hierarchical variational autoencoder (FHVAE) is a variational inference-based model that formulates a hierarchical generative process for sequential data. Specifically, an FHVAE model can learn disentangled and interpretable representations, which have been proven useful for numerous speech applications, such as speaker verification, robust speech recognition, and voice conversion. However, as we will elaborate in this paper, the training algorithm proposed in the original paper is not scalable to datasets of thousands of hours, which makes this model less applicable on a larger scale. After identifying limitations in terms of runtime, memory, and hyperparameter optimization,…

Taxonomy

Methods · Interpretability