Generative Modeling of Discrete Data Using Geometric Latent Subspaces
Daniel Gonzalez-Alvarado, Jonas Cassel, Stefania Petra, Christoph Schnörr

TL;DR
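This paper introduces a geometric latent-subspace framework for generative modeling of discrete data. It uses Riemannian geometry to relate a low-dimensional latent subspace to the induced data manifold, so that flow matching can be carried out consistently in the latent space.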
Contribution
It develops a geometric approach based on latent subspaces in the exponential parameter space of product manifolds of categorical distributions, enabling consistent flow matching and dimensionality reduction for discrete data.
Findings
Low-dimensional latent spaces effectively model high-dimensional discrete data.
The geometric PCA (GPCA) method improves model training via Riemannian geometry.
Empirical results demonstrate accurate data modeling with reduced latent dimensions.
Abstract
We propose a geometric latent-subspace framework for generative modeling of discrete data. Specifically, we introduce latent subspaces in the exponential parameter space of product manifolds of categorical distributions as a novel method for learning generative models of discrete data. The resulting low-dimensional latent space encodes statistical dependencies and removes redundant degrees of freedom among the categorical variables. We equip the parameter domain with a Riemannian geometry such that the latent subspace and induced data manifold are related by isometries enabling consistent flow matching. Exploiting this structure, we propose a geometry-aware dimensionality reduction objective, called geometric PCA (GPCA), which we formulate as a regularized cross-entropy minimization that encourages small Riemannian distances between the data and their reconstructions. In particular,…
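The core idea of the abstract, a low-dimensional subspace in the exponential (natural) parameter space fitted by regularized cross-entropy minimization, can be illustrated with a minimal sketch. This is not the authors' GPCA objective: it assumes an affine subspace `theta = Z @ V + b`, a blockwise softmax as the map to categorical distributions, and a plain Euclidean L2 penalty standing in for the paper's Riemannian regularizer, whose exact form is not given in the text above. All function and parameter names (`fit_latent_subspace`, `reconstruct`, `lam`, `lr`) are hypothetical.

```python
import numpy as np


def log_softmax_blocks(theta, n_vars, n_cats):
    """Log-probabilities of n_vars categorical variables, one softmax per block."""
    t = theta.reshape(-1, n_vars, n_cats)
    t = t - t.max(axis=-1, keepdims=True)  # shift for numerical stability
    log_p = t - np.log(np.exp(t).sum(axis=-1, keepdims=True))
    return log_p.reshape(theta.shape)


def reconstruct(Z, V, b, n_vars, n_cats):
    """Decode latent codes: affine map to natural parameters, then blockwise softmax."""
    return np.exp(log_softmax_blocks(Z @ V + b, n_vars, n_cats))


def fit_latent_subspace(X, n_vars, n_cats, d, lam=1e-3, lr=0.2, steps=300, seed=0):
    """Fit a d-dimensional affine subspace of natural parameters by gradient descent.

    X is an (N, n_vars * n_cats) array of concatenated one-hot encodings.
    The L2 penalty is an assumption of this sketch, not the paper's regularizer.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    Z = 0.01 * rng.standard_normal((N, d))  # latent codes, one row per sample
    V = 0.01 * rng.standard_normal((d, D))  # basis of the latent subspace
    b = np.zeros(D)                         # offset in natural-parameter space
    history = []
    for _ in range(steps):
        log_p = log_softmax_blocks(Z @ V + b, n_vars, n_cats)
        loss = -np.sum(X * log_p) / N + 0.5 * lam * (np.sum(Z**2) + np.sum(V**2))
        history.append(loss)
        # Gradient of the cross-entropy w.r.t. natural parameters is (p - x).
        R = (np.exp(log_p) - X) / N
        gZ = R @ V.T + lam * Z
        gV = Z.T @ R + lam * V
        gb = R.sum(axis=0)
        Z -= lr * gZ
        V -= lr * gV
        b -= lr * gb
    return Z, V, b, history
```

Calling `fit_latent_subspace(X, n_vars, n_cats, d=2)` on one-hot encoded data returns the latent codes `Z`, the subspace basis `V`, the offset `b`, and the loss per step; `reconstruct(Z, V, b, n_vars, n_cats)` then gives the reconstructed categorical probabilities. The sketch only shows the cross-entropy fitting step and does not reproduce the isometry between latent subspace and data manifold or the flow matching described in the abstract.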
