LASERS: LAtent Space Encoding for Representations with Sparsity for Generative Modeling

Xin Li; Anand Sarwate

arXiv:2409.11184·cs.LG·September 18, 2024

TL;DR

This paper introduces a novel sparse, dictionary-based latent space representation for generative models, which improves expressiveness and reconstruction quality over traditional vector quantization methods by relaxing the assumption of inherent data discreteness.

Contribution

The authors propose a union of subspaces model with learned dictionaries and sparsity constraints for latent space representation, enhancing generative modeling performance.
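
One way to write down the contrast between the two models is sketched below. The notation is an assumption made here for illustration and is not taken from the paper: $z \in \mathbb{R}^d$ is a latent vector, $c_1, \dots, c_K$ are VQ codewords, $D \in \mathbb{R}^{d \times K}$ is a dictionary with columns $d_1, \dots, d_K$, and $s$ is a sparsity level.

```latex
\begin{align*}
\text{VQ:}\quad
  & \hat{z} = c_{j^\star},
  \qquad j^\star = \arg\min_{j \in \{1,\dots,K\}} \lVert z - c_j \rVert_2 \\
\text{Sparse dictionary:}\quad
  & \hat{z} = D\alpha^\star,
  \qquad \alpha^\star = \arg\min_{\alpha:\ \lVert \alpha \rVert_0 \le s} \lVert z - D\alpha \rVert_2 \\
\text{Union of subspaces:}\quad
  & \hat{z} \in \bigcup_{S \subseteq \{1,\dots,K\},\ |S| \le s}
    \operatorname{span}\{\, d_i : i \in S \,\}
\end{align*}
```

If the columns of $D$ are set to the codewords, VQ corresponds to restricting $\alpha$ to one-hot vectors, so under this notation the sparse model contains every VQ reconstruction as a special case.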

Findings

01 Sparse latent representations outperform VQ in reconstruction quality.

02 The approach addresses codebook collapse issues in VQ models.

03 Latent space sparsity offers benefits beyond discretization, such as better expressiveness.

Abstract

Learning compact and meaningful latent space representations has been shown to be very useful in generative modeling tasks for visual data. One particular example is applying Vector Quantization (VQ) in variational autoencoders (VQ-VAEs, VQ-GANs, etc.), which has demonstrated state-of-the-art performance in many modern generative modeling applications. Quantizing the latent space has been justified by the assumption that the data themselves are inherently discrete in the latent space (like pixel values). In this paper, we propose an alternative representation of the latent space that relaxes the structural assumption of the VQ formulation. Specifically, we assume that the latent space can be approximated by a union of subspaces model corresponding to a dictionary-based representation under a sparsity constraint. The dictionary is learned/updated during the training process. We apply…
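
The difference between quantizing a latent vector and sparse-coding it can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the sparse codes are computed or how the dictionary is updated, so the sketch assumes a fixed random dictionary and uses orthogonal matching pursuit (OMP) as a stand-in sparse solver. The function names `vq_encode` and `omp_encode`, and all sizes, are hypothetical.

```python
import numpy as np


def vq_encode(z, codebook):
    """Vector quantization: replace z by its single nearest codeword (row)."""
    dists = np.sum((codebook - z) ** 2, axis=1)
    idx = int(np.argmin(dists))
    return idx, codebook[idx]


def omp_encode(z, dictionary, k):
    """Greedy k-sparse code of z over the rows (atoms) of `dictionary`.

    OMP is an assumption made for this sketch, not the paper's stated solver.
    """
    norms = np.linalg.norm(dictionary, axis=1)
    residual = z.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(k):
        # Score each atom by its normalized correlation with the residual.
        scores = np.abs(dictionary @ residual) / norms
        if support:
            scores[support] = -np.inf  # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        # Refit all selected coefficients by least squares.
        sub = dictionary[support].T  # shape (d, len(support))
        sol, *_ = np.linalg.lstsq(sub, z, rcond=None)
        residual = z - sub @ sol
    coeffs = np.zeros(dictionary.shape[0])
    coeffs[support] = sol
    return coeffs, coeffs @ dictionary


# Toy comparison on one random latent vector (all sizes are illustrative).
rng = np.random.default_rng(0)
atoms = rng.normal(size=(16, 8))  # 16 atoms / codewords of dimension 8
z = rng.normal(size=8)

_, z_vq = vq_encode(z, atoms)
coeffs, z_sparse = omp_encode(z, atoms, k=3)

vq_err = float(np.sum((z - z_vq) ** 2))
sparse_err = float(np.sum((z - z_sparse) ** 2))
print(f"VQ error: {vq_err:.4f}, 3-sparse error: {sparse_err:.4f}")
```

Because the first OMP step already picks the best single atom with an optimally scaled coefficient, and later steps only refit on a larger support, the sparse reconstruction error in this sketch is never larger than the VQ error when both use the same atoms. That is only an illustration of the added flexibility; it says nothing about the results reported in the paper.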

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage