Theoretical Convergence Guarantees for Variational Autoencoders

Sobihan Surendran (LPSM (UMR\_8001)); Antoine Godichon-Baggioni (LPSM (UMR\_8001)); Sylvain Le Corff (LPSM (UMR\_8001); SU)

arXiv:2410.16750·stat.ML·December 23, 2025

TL;DR

This paper provides the first non-asymptotic convergence guarantees for Variational Autoencoders (VAE) trained with Stochastic Gradient Descent (SGD) and Adam, addressing a gap in the theoretical understanding of VAE training.

Contribution

It derives explicit convergence rates for VAE training with SGD and Adam. The analysis covers Linear VAE and Deep Gaussian VAE, as well as variants such as $\beta$-VAE and IWAE.

Findings

01

Convergence rate of $\mathcal{O}(\log n / \sqrt{n})$ established, where $n$ is the number of iterations of the optimization algorithm.

02

Explicit dependencies on the batch size, the number of variational samples, and other key hyperparameters identified.

03

Empirical analysis shows hyperparameters significantly affect convergence.
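
To get a feel for how quickly a rate of order $\mathcal{O}(\log n / \sqrt{n})$ shrinks, the sketch below evaluates $\log n / \sqrt{n}$ for a few iteration counts. It is only an illustration: the constants of the paper's bound (which carry the dependence on batch size and other hyperparameters) are omitted, and the helper names `rate` and `rate_table` are hypothetical.

```python
import math


def rate(n: int) -> float:
    """Return log(n) / sqrt(n), the order of the stated rate (constants omitted)."""
    if n < 2:
        raise ValueError("n must be at least 2 so that log(n) > 0")
    return math.log(n) / math.sqrt(n)


def rate_table(iterations):
    """Pair each iteration count with the corresponding value of log(n) / sqrt(n)."""
    return [(n, rate(n)) for n in iterations]


if __name__ == "__main__":
    for n, r in rate_table([10**2, 10**4, 10**6]):
        print(f"n = {n:>9,d}   log(n)/sqrt(n) = {r:.5f}")
```

Multiplying the number of iterations by 100 divides the $1/\sqrt{n}$ factor by 10, while the $\log n$ factor grows only slowly, so the value drops from roughly 0.46 at $n = 10^2$ to roughly 0.014 at $n = 10^6$.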

Abstract

Variational Autoencoders (VAE) are popular generative models used to sample from complex data distributions. Despite their empirical success in various machine learning tasks, significant gaps remain in understanding their theoretical properties, particularly regarding convergence guarantees. This paper aims to bridge that gap by providing non-asymptotic convergence guarantees for VAE trained using both Stochastic Gradient Descent and Adam algorithms. We derive a convergence rate of $\mathcal{O}(\log n / \sqrt{n})$, where $n$ is the number of iterations of the optimization algorithm, with explicit dependencies on the batch size, the number of variational samples, and other key hyperparameters. Our theoretical analysis applies to both Linear VAE and Deep Gaussian VAE, as well as several VAE variants, including $\beta$-VAE and IWAE. Additionally, we empirically illustrate the impact of…
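
The abstract names three quantities that enter the rate: the batch size, the number of variational samples, and hyperparameters such as the $\beta$ weight of $\beta$-VAE. As a minimal sketch of where each of them appears in a stochastic training objective, the code below estimates a $\beta$-weighted ELBO for a one-dimensional linear Gaussian model using the reparameterization trick. The model, its parameterization, and all function names are assumptions made for illustration; this is not the paper's setup or code.

```python
import math
import random


def gaussian_log_pdf(x, mean, var):
    """Log-density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def kl_to_standard_normal(mu, var):
    """Closed-form KL( N(mu, var) || N(0, 1) )."""
    return 0.5 * (var + mu ** 2 - 1.0 - math.log(var))


def beta_elbo_estimate(x, enc, dec, num_samples, beta=1.0, rng=random):
    """Monte Carlo estimate of the beta-weighted ELBO for one data point.

    Hypothetical toy model:
      encoder q(z | x) = N(a * x + c, s2)
      decoder p(x | z) = N(w * z + b, sigma2)
      prior   p(z)     = N(0, 1)
    """
    a, c, s2 = enc
    w, b, sigma2 = dec
    mu = a * x + c
    recon = 0.0
    for _ in range(num_samples):  # number of variational samples
        eps = rng.gauss(0.0, 1.0)
        z = mu + math.sqrt(s2) * eps  # reparameterization trick
        recon += gaussian_log_pdf(x, w * z + b, sigma2)
    return recon / num_samples - beta * kl_to_standard_normal(mu, s2)


def minibatch_objective(batch, enc, dec, num_samples, beta=1.0, rng=random):
    """Average the per-point estimates over a mini-batch (batch size = len(batch))."""
    total = sum(
        beta_elbo_estimate(x, enc, dec, num_samples, beta, rng) for x in batch
    )
    return total / len(batch)


if __name__ == "__main__":
    rng = random.Random(0)
    enc = (0.5, 0.0, 0.25)  # (a, c, s2), arbitrary illustrative values
    dec = (1.0, 0.0, 0.5)   # (w, b, sigma2), arbitrary illustrative values
    print(minibatch_objective([1.0, -2.0, 0.3], enc, dec, num_samples=8, rng=rng))
```

Increasing `num_samples` or the batch length reduces the variance of this estimate, which is one intuitive reason such quantities can show up explicitly in a convergence bound for SGD or Adam.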

Taxonomy

Methods: Adam