ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders

Surojit Saha; Sarang Joshi; Ross Whitaker

arXiv:2501.10901·cs.LG·January 28, 2025

TL;DR

This paper introduces ARD-VAE, a statistical formulation that uses a hierarchical prior in the latent space to identify the relevant latent dimensions of a VAE automatically, so the size of the latent space no longer has to be chosen by trial and error.

Contribution

The paper proposes a hierarchical prior in the VAE latent space that discovers the relevant latent factors automatically, removing the need to select the latent dimension manually.

Findings

01. Effectively identifies relevant latent dimensions across datasets

02. Improves evaluation metrics such as FID score and disentanglement

03. Reduces reliance on trial-and-error hyperparameter tuning

Abstract

The variational autoencoder (VAE) is a popular, deep, latent-variable model (DLVM) due to its simple yet effective formulation for modeling the data distribution. Moreover, optimizing the VAE objective function is more manageable than other DLVMs. The bottleneck dimension of the VAE is a crucial design choice, and it has strong ramifications for the model's performance, such as finding the hidden explanatory factors of a dataset using the representations learned by the VAE. However, the size of the latent dimension of the VAE is often treated as a hyperparameter estimated empirically through trial and error. To this end, we propose a statistical formulation to discover the relevant latent factors required for modeling a dataset. In this work, we use a hierarchical prior in the latent space that estimates the variance of the latent axes using the encoded data, which identifies the…
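As a rough illustration of the general idea of estimating the variance of each latent axis from encoded data, here is a minimal sketch. It is not the paper's formulation: the hierarchical prior is not implemented, the toy latent codes are synthetic, and the function names `per_axis_variance` and `relevant_axes`, along with the variance threshold, are hypothetical choices made only for this example.

```python
import random
import statistics

def per_axis_variance(codes):
    """Sample variance of each latent axis over a batch of encoded points."""
    n_axes = len(codes[0])
    return [statistics.variance([z[d] for z in codes]) for d in range(n_axes)]

def relevant_axes(codes, threshold=0.1):
    """Indices of axes whose estimated variance exceeds the threshold.

    The threshold rule is an assumption for illustration, not the
    criterion used by ARD-VAE.
    """
    return [d for d, v in enumerate(per_axis_variance(codes)) if v > threshold]

# Toy stand-in for encoded data: axes 0 and 2 vary across samples,
# axes 1 and 3 are nearly constant (standard deviation 0.01).
rng = random.Random(0)
true_scales = [1.0, 0.01, 2.0, 0.01]
codes = [[rng.gauss(0.0, s) for s in true_scales] for _ in range(500)]

selected = relevant_axes(codes)
print(selected)
```

In this toy setting the two high-variance axes are the ones selected. A real model would obtain `codes` from a trained encoder, and the paper's statistical formulation replaces the fixed threshold used here.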

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis