Embedding Inversion via Conditional Masked Diffusion Language Models

Han Xiao

arXiv:2602.11047·cs.CL·February 19, 2026

TL;DR

This paper frames embedding inversion as conditional masked diffusion: all tokens are recovered in parallel through iterative denoising in 8 forward passes, with no access to the target encoder at inference time and no iterative correction, in contrast to sequential autoregressive generation.

Contribution

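The authors propose a conditional masked diffusion approach to embedding inversion: a masked diffusion language model, conditioned on the target embedding via adaptive layer normalization, recovers all tokens in parallel in 8 forward passes and needs no access to the target encoder at inference time.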

Findings

01. Achieves token recovery with only 8 forward passes.

02. Operates without encoder access, iterative correction, or architecture-specific alignment.

03. Evaluated on 32-token sequences across three embedding models.
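
The parallel recovery described in the findings can be illustrated with a toy decoding loop. This is a minimal sketch, not the paper's implementation: `toy_denoiser` is a hypothetical stand-in that returns random logits, `MASK_ID` is a made-up sentinel, and the confidence-based unmasking schedule (reveal the most confident positions first, spread evenly over the steps) is an assumption, since the listing does not say how the model chooses which tokens to commit at each step.

```python
import numpy as np

MASK_ID = -1  # hypothetical sentinel for a masked position


def toy_denoiser(tokens, embedding, vocab_size, rng):
    """Hypothetical stand-in for the conditioned model.

    A real model would read `tokens` and `embedding`; this one just
    returns random logits of shape (seq_len, vocab_size).
    """
    return rng.normal(size=(tokens.shape[0], vocab_size))


def parallel_unmask(embedding, seq_len=32, num_steps=8, vocab_size=100,
                    seed=0, denoiser=toy_denoiser):
    """Start fully masked and commit a batch of tokens per forward pass."""
    rng = np.random.default_rng(seed)
    tokens = np.full(seq_len, MASK_ID, dtype=np.int64)
    forward_passes = 0

    for step in range(num_steps):
        masked = np.flatnonzero(tokens == MASK_ID)
        if masked.size == 0:
            break

        # One forward pass predicts every position at once.
        logits = denoiser(tokens, embedding, vocab_size, rng)
        forward_passes += 1

        # Softmax over the vocabulary, then per-position prediction and confidence.
        logits = logits - logits.max(axis=-1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=-1, keepdims=True)
        pred = probs.argmax(axis=-1)
        conf = probs.max(axis=-1)

        # Assumed schedule: spread the remaining masked positions evenly
        # over the remaining steps, committing the most confident first.
        k = int(np.ceil(masked.size / (num_steps - step)))
        chosen = masked[np.argsort(-conf[masked])[:k]]
        tokens[chosen] = pred[chosen]

    return tokens, forward_passes


tokens, passes = parallel_unmask(embedding=np.zeros(16))
print(passes)  # 8
```

With 32 positions and 8 steps, this schedule commits 4 tokens per pass, so the sequence is fully unmasked after exactly 8 forward passes and the target encoder is never called inside the loop.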

Abstract

We frame embedding inversion as conditional masked diffusion, recovering all tokens in parallel through iterative denoising rather than sequential autoregressive generation. A masked diffusion language model is conditioned on the target embedding via adaptive layer normalization, requiring only 8 forward passes with no access to the target encoder at inference time. On 32-token sequences across three embedding models, the method achieves token recovery through parallel denoising without requiring encoder access, iterative correction, or architecture-specific alignment. Source code and live demo are available at https://github.com/jina-ai/embedding-inversion-demo.
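
As a rough illustration of conditioning via adaptive layer normalization, the sketch below modulates normalized hidden states with a scale and shift computed from the target embedding. It is a sketch under stated assumptions rather than the paper's code: the `(1 + scale)` form, the single linear projections `w_scale` and `w_shift`, and all shapes are hypothetical choices, since the abstract names the mechanism but not its exact parameterization.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize each position over the hidden dimension (no learned affine)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def adaptive_layer_norm(x, cond, w_scale, w_shift):
    """Scale and shift normalized activations using the conditioning vector.

    x:       (seq_len, hidden) hidden states of the denoiser
    cond:    (embed_dim,) target embedding to invert
    w_scale: (embed_dim, hidden) hypothetical projection for the scale
    w_shift: (embed_dim, hidden) hypothetical projection for the shift
    """
    scale = cond @ w_scale  # (hidden,)
    shift = cond @ w_shift  # (hidden,)
    # Assumed form: zero projections reduce this to a plain layer norm.
    return layer_norm(x) * (1.0 + scale) + shift


rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(32, 64))  # 32 tokens, hidden size 64
target_embedding = rng.normal(size=128)
w_scale = rng.normal(size=(128, 64)) * 0.01
w_shift = rng.normal(size=(128, 64)) * 0.01

out = adaptive_layer_norm(hidden_states, target_embedding, w_scale, w_shift)
print(out.shape)  # (32, 64)
```

The same scale and shift are applied at every position, so the embedding influences all 32 tokens at once, which fits a model that predicts all positions in parallel.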

Datasets

doannv/mi-research (dataset)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Speech Recognition and Synthesis · Topic Modeling