Geometry-Preserving Encoder/Decoder in Latent Generative Models
Wonjun Lee, Riley C.W. O'Neill, Dongmian Zou, Jeff Calder, Gilad Lerman

TL;DR
This paper introduces a geometry-preserving encoder/decoder framework for latent generative models, offering theoretical convergence guarantees and improved training efficiency over traditional VAEs.
Contribution
The paper proposes a novel geometry-preserving encoder/decoder with theoretical convergence guarantees, enhancing training efficiency in latent diffusion models.
Findings
Faster convergence of decoder training with the new encoder
Theoretical proofs of convergence for encoder training
Advantages over traditional VAEs in preserving data geometry
Abstract
Generative modeling aims to generate new data samples that resemble a given dataset, with diffusion models recently becoming the most popular generative model. One of the main challenges of diffusion models is solving the problem in the input space, which tends to be very high-dimensional. Recently, solving diffusion models in the latent space through an encoder that maps from the data space to a lower-dimensional latent space has been considered to make the training process more efficient and has shown state-of-the-art results. The variational autoencoder (VAE) is the most commonly used encoder/decoder framework in this domain, known for its ability to learn latent representations and generate data samples. In this paper, we introduce a novel encoder/decoder framework with theoretical properties distinct from those of the VAE, specifically designed to preserve the geometric structure…
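To make "preserving the geometric structure" concrete, here is a minimal sketch that scores an encoder by how much it distorts pairwise Euclidean distances between the data space and the latent space. This is an illustrative assumption, not the paper's objective or implementation: the function names `pairwise_distances` and `distance_distortion`, the mean-squared distortion score, and the toy linear encoders are all hypothetical.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance between every pair of rows in `points`."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))


def distance_distortion(data, latents):
    """Mean squared gap between data-space and latent-space pairwise distances.

    Hypothetical score for illustration only; zero means the encoder
    is an isometry on this sample.
    """
    d_data = pairwise_distances(data)
    d_latent = pairwise_distances(latents)
    return float(np.mean((d_data - d_latent) ** 2))


rng = np.random.default_rng(0)

# Toy dataset: 50 points lying on a 2-D linear subspace of an 8-D data space.
basis, _ = np.linalg.qr(rng.normal(size=(8, 2)))  # 8x2, orthonormal columns
coords = rng.normal(size=(50, 2))
data = coords @ basis.T                           # shape (50, 8)

# Two hypothetical linear "encoders" mapping 8-D data to 2-D latents.
isometric_latents = data @ basis        # recovers coords, keeps distances
shrunk_latents = 0.5 * (data @ basis)   # halves every pairwise distance

print("isometric encoder distortion:", distance_distortion(data, isometric_latents))
print("shrinking encoder distortion:", distance_distortion(data, shrunk_latents))
```

The first encoder gives a distortion of essentially zero (up to floating-point error), while the second is penalized for shrinking distances. A score of this kind is one way to compare encoders on geometry preservation under the stated assumptions.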
Taxonomy
Topics: Semantic Web and Ontologies
Methods: Diffusion
