TL;DR
This paper introduces a spherical flow approach for sampling categorical data by leveraging the von Mises-Fisher distribution on the sphere, enabling improved generative modeling of discrete sequences.
Contribution
It develops a novel spherical flow framework using vMF distributions, reducing the continuity equation to a scalar ODE, and demonstrates improved sampling results over existing methods.
Findings
The vMF-based path outperforms geodesic and Euclidean alternatives in the reported experiments.
The approach significantly improves Sudoku and language modeling results.
Posterior-based velocity and score computation enable effective ODE and PC sampling.
Abstract
We study the problem of learning generative models for discrete sequences in a continuous embedding space. Whereas prior approaches typically operate in Euclidean space or on the probability simplex, we instead work on the sphere. There the von Mises-Fisher (vMF) distribution induces a natural noise process and admits a closed-form conditional score. The conditional velocity is in general intractable. Exploiting the radial symmetry of the vMF density, we reduce the continuity equation on the sphere to a scalar ODE in the cosine similarity, whose unique bounded solution determines the velocity. The marginal velocity and marginal score on the sphere both decompose into posterior-weighted tangent sums that differ only by per-token scalar weights. This gives access to both ODE and predictor-corrector (PC) sampling. The posterior is the only learned object,…
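To make the "posterior-weighted tangent sum" concrete, here is a minimal NumPy sketch of the score side only. It is not the paper's implementation. It relies on the standard fact that the vMF log-density is kappa * <mu, x> plus a constant, so its gradient along the sphere is kappa times the projection of mu onto the tangent space at x. The function names, the fixed concentration `kappa`, the random token embeddings, and `toy_posterior` (the exact posterior of a uniform mixture of vMFs sharing one kappa, used as a stand-in for the learned posterior) are all assumptions made for illustration. Any time dependence of the concentration is omitted.

```python
import numpy as np

def tangent_project(x, v):
    """Project v onto the tangent space of the unit sphere at x."""
    return v - np.dot(v, x) * x

def vmf_conditional_score(x, mu, kappa):
    """Gradient along the sphere of log vMF(x; mu, kappa) with respect to x."""
    return kappa * tangent_project(x, mu)

def toy_posterior(x, embeddings, kappa):
    """Hypothetical stand-in for the learned posterior over tokens.

    Exact for a uniform mixture of vMFs with a shared kappa, where the
    normalizing constants cancel and p(k | x) is a softmax of kappa * <mu_k, x>.
    """
    logits = kappa * (embeddings @ x)
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def marginal_score(x, embeddings, kappa, posterior):
    """Posterior-weighted sum of per-token tangent directions at x."""
    # Row k is kappa * (mu_k - <mu_k, x> x), the conditional score for token k.
    per_token = kappa * (embeddings - np.outer(embeddings @ x, x))
    return posterior @ per_token

# Toy setup (assumed sizes): 5 unit-norm token embeddings in 8 dimensions.
rng = np.random.default_rng(0)
d, vocab, kappa = 8, 5, 4.0
embeddings = rng.normal(size=(vocab, d))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)
x = rng.normal(size=d)
x /= np.linalg.norm(x)

posterior = toy_posterior(x, embeddings, kappa)
score = marginal_score(x, embeddings, kappa, posterior)
```

In this sketch the per-token scalar weight is simply `kappa * posterior[k]`. Under the abstract's description, the marginal velocity would reuse the same tangent directions with different scalars, obtained from the scalar ODE, which is not reproduced here.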
