Cosmos: Compressed and Smooth Latent Space for Text Diffusion Modeling

Viacheslav Meshchaninov; Egor Chimbulatov; Alexander Shabalin; Aleksandr Abramov; Dmitry Vetrov

arXiv:2506.21170 · cs.CL · January 6, 2026

Open Access

TL;DR

Cosmos introduces a novel compressed, smooth latent space for text diffusion, enabling faster, high-quality text generation across multiple tasks by combining autoencoding and semantic alignment.

Contribution

The paper presents a new latent space for text diffusion models that reduces dimensionality and improves efficiency, together with a training method that aligns autoencoder representations with those of a pretrained language model.

Findings

01 Compresses text representations by 8x without quality loss

02 Surpasses baselines when the latent sequence length is increased

03 Achieves over 2x faster inference while maintaining quality

Abstract

Autoregressive language models dominate modern text generation, yet their sequential nature introduces fundamental limitations: decoding is slow, and maintaining global coherence remains challenging. Diffusion models offer a promising alternative by enabling parallel generation and flexible control; however, their application to text generation is hindered by the high dimensionality of token-level representations. We introduce Cosmos, a novel approach to text generation that operates entirely in a compressed, smooth latent space tailored specifically for diffusion. This space is learned using an autoencoder trained simultaneously for token-level reconstruction and alignment with frozen activations from a pretrained language encoder, providing robust semantic grounding and enabling effective perturbation-based augmentations. Empirically, we demonstrate that text representations can be…
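
The joint training objective described in the abstract (token-level reconstruction plus alignment with frozen encoder activations) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the use of cross-entropy for reconstruction, mean squared error for alignment, the `align_weight` parameter, and all function names are assumptions introduced here.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(token_logits, token_ids, predicted_acts, frozen_acts,
               align_weight=1.0):
    """Hypothetical combined objective: reconstruction + alignment.

    token_logits:   per-position decoder logits over the vocabulary
    token_ids:      original token ids to reconstruct
    predicted_acts: activations produced from the autoencoder's latents
    frozen_acts:    activations of a frozen pretrained encoder (targets)
    align_weight:   assumed weighting between the two terms
    """
    recon = sum(cross_entropy(l, t)
                for l, t in zip(token_logits, token_ids)) / len(token_ids)
    align = sum(mse(p, f)
                for p, f in zip(predicted_acts, frozen_acts)) / len(frozen_acts)
    return recon + align_weight * align, recon, align


# Toy example: two positions, vocabulary of three, activation width of two.
logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
ids = [0, 1]
pred = [[0.5, 0.5], [1.0, 0.0]]
frozen = [[0.5, 0.5], [1.0, 0.0]]
total, recon, align = joint_loss(logits, ids, pred, frozen)
```

In the toy example the predicted activations match the frozen targets exactly, so the alignment term vanishes and the total equals the reconstruction term. In practice both terms would be computed on batched tensors and minimized jointly by gradient descent.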

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Diffusion