Multivariate Variational Autoencoder

Mehmet Can Yavuz

arXiv:2511.07472·cs.LG·December 2, 2025

Open Access

TL;DR

The paper introduces the Multivariate Variational Autoencoder (MVAE), a full-covariance extension of VAEs that models correlated latent factors, improving calibration, disentanglement, and reconstruction quality across multiple datasets.

Contribution

It proposes a tractable multivariate Gaussian posterior for VAEs, enabling correlated latent variables while maintaining simple optimization and evaluation procedures.

Findings

1. MVAE outperforms diagonal VAEs in calibration and clustering metrics.

2. MVAE provides smoother, more coherent latent traversals.

3. MVAE achieves better or comparable reconstruction quality across datasets.

Abstract

Learning latent representations that are simultaneously expressive, geometrically well-structured, and reliably calibrated remains a central challenge for Variational Autoencoders (VAEs). Standard VAEs typically assume a diagonal Gaussian posterior, which simplifies optimization but rules out correlated uncertainty and often yields entangled or redundant latent dimensions. We introduce the Multivariate Variational Autoencoder (MVAE), a tractable full-covariance extension of the VAE that augments the encoder with sample-specific diagonal scales and a global coupling matrix. This induces a multivariate Gaussian posterior of the form $\mathcal{N}\left(\mu_{\phi}(x),\; C \,\mathrm{diag}\left(\sigma_{\phi}^{2}(x)\right) C^{\top}\right)$, enabling correlated latent factors while preserving a closed-form KL divergence and a simple reparameterization path. Beyond likelihood, we propose a multi-criterion evaluation protocol that…
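The posterior form quoted in the abstract can be illustrated with a minimal NumPy sketch of the reparameterized sample and the closed-form KL term. This is not the paper's implementation. It assumes a standard normal prior N(0, I), an invertible square coupling matrix C, and an encoder that outputs a mean and a log-variance per sample; none of these details are stated in the abstract, and the function names `sample_posterior` and `kl_to_standard_normal` are hypothetical.

```python
import numpy as np


def sample_posterior(mu, log_var, C, rng):
    """Draw z = mu + C (sigma * eps) with eps ~ N(0, I).

    The resulting covariance is C diag(sigma^2) C^T, matching the
    posterior form in the abstract. Shapes: mu, log_var are (batch, d);
    C is (d, d) and shared across samples (assumed global).
    """
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal(mu.shape)
    # Row-wise (sigma * eps) @ C.T is equivalent to C @ (sigma * eps) per sample.
    return mu + (sigma * eps) @ C.T


def kl_to_standard_normal(mu, log_var, C):
    """KL( N(mu, C diag(sigma^2) C^T) || N(0, I) ) per sample.

    Assumes a standard normal prior and an invertible C (both are
    assumptions of this sketch, not taken from the source).
    """
    d = mu.shape[-1]
    sigma2 = np.exp(log_var)
    # tr(C D C^T) = sum_j sigma_j^2 * ||C[:, j]||^2
    col_sq_norms = np.sum(C ** 2, axis=0)
    trace = sigma2 @ col_sq_norms
    # log det(C D C^T) = 2 log|det C| + sum_j log sigma_j^2
    _, logabsdet_C = np.linalg.slogdet(C)
    log_det = 2.0 * logabsdet_C + np.sum(log_var, axis=-1)
    return 0.5 * (trace + np.sum(mu ** 2, axis=-1) - d - log_det)
```

With C set to the identity matrix, the KL expression reduces to the familiar diagonal-VAE term, which is one way to sanity-check a sketch like this. Because C is shared across samples, its log-determinant is computed once per batch rather than once per input.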

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning