Language Modeling with Hyperspherical Flows

Justin Deschenaux; Caglar Gulcehre

arXiv:2605.11125·cs.LG·May 19, 2026

TL;DR

The paper introduces $\mathbb{S}$-FLM, a hyperspherical flow language model that improves sequence generation efficiency and semantic interpretability over traditional flow models, especially in reasoning tasks.

Contribution

It proposes a novel latent FLM in hyperspherical space that generates sequences via rotations, reducing computational overhead and enhancing performance in reasoning tasks.
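
The summary does not say how the rotations are parameterized. As a generic illustration only, and not the paper's method, the sketch below uses spherical linear interpolation (slerp) to rotate a unit "noise" vector toward a unit "data" vector while every intermediate point stays on the sphere. The names `normalize`, `slerp`, `noise`, `target`, and `path` are hypothetical.

```python
import math

def normalize(vec):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def slerp(start, end, t):
    """Rotate from `start` toward `end` along the great circle, t in [0, 1].

    Assumption: this is a standard geometric construction, used here only
    to illustrate what "moving by rotation on a sphere" means.
    """
    start, end = normalize(start), normalize(end)
    cos_angle = max(-1.0, min(1.0, sum(a * b for a, b in zip(start, end))))
    angle = math.acos(cos_angle)
    if angle < 1e-8:
        # The two points coincide; there is nothing to rotate.
        return list(start)
    if math.pi - angle < 1e-8:
        # Antipodal points have no unique great circle between them.
        raise ValueError("slerp is undefined for antipodal vectors")
    sin_angle = math.sin(angle)
    w_start = math.sin((1.0 - t) * angle) / sin_angle
    w_end = math.sin(t * angle) / sin_angle
    return [w_start * a + w_end * b for a, b in zip(start, end)]

# Toy 3-dimensional example: five evenly spaced points on the rotation path.
noise = [1.0, 0.0, 0.0]
target = [0.0, 1.0, 0.0]
path = [slerp(noise, target, k / 4) for k in range(5)]
```

Unlike straight-line interpolation in Euclidean space, the intermediate points here keep unit norm, so the trajectory never leaves the hypersphere.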

Findings

01. $\mathbb{S}$-FLM improves large-vocabulary reasoning performance.

02. It closes the gap to masked diffusion models at standard temperature.

03. It remains less effective at low-temperature decoding.

Abstract

Discrete Diffusion Language Models progressed rapidly as an alternative to autoregressive (AR) models, motivated by their parallel generation abilities. However, for tractability, discrete diffusion models sample from a factorized distribution, which is less expressive than AR. Recent Flow Language Models (FLMs) apply continuous flows to language, transporting noise to data with a deterministic ODE that avoids factorized sampling. FLMs operate on one-hot vectors whose dimension scales with the vocabulary size, making FLMs costly to train. Moreover, since all distinct one-hot embeddings are equidistant in $\ell_{2}$, adding Gaussian noise does not have a clear semantic interpretation (unlike images, where Gaussian noise progressively degrades structure). We introduce $\mathbb{S}$-FLM, a latent FLM in the hypersphere. $\mathbb{S}$-FLM generates sequences by rotating vectors in…
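
The equidistance claim in the abstract can be checked directly: any two distinct one-hot vectors differ in exactly two coordinates, so their $\ell_2$ distance is always $\sqrt{2}$, regardless of which tokens they encode. The minimal sketch below verifies this on a toy vocabulary; the vocabulary size of 6 and the helper names `one_hot` and `l2_distance` are illustrative assumptions, not taken from the paper.

```python
import math
from itertools import combinations

def one_hot(index, vocab_size):
    """Return the one-hot vector for token `index` as a list of floats."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

vocab_size = 6  # hypothetical toy vocabulary
embeddings = [one_hot(i, vocab_size) for i in range(vocab_size)]

# Distance for every unordered pair of distinct tokens.
distances = [l2_distance(a, b) for a, b in combinations(embeddings, 2)]

# Every pair is exactly sqrt(2) apart, so distance carries no semantic signal.
all_equidistant = all(abs(d - math.sqrt(2)) < 1e-12 for d in distances)
```

Because every pair is the same distance apart, a noisy one-hot vector is no "closer" to a semantically related token than to an unrelated one, which is the interpretability issue the abstract raises.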

Code & Models

Models

🤗 jdeschena/s-flm (model)