Projected Autoregression: Autoregressive Language Generation in Continuous State Space

Oshri Naparstek

arXiv:2601.04854 · cs.CL · April 7, 2026

TL;DR

This paper introduces Projected Autoregression, an autoregressive language generation method that predicts the next step as a vector in continuous embedding space and commits to a discrete token only afterwards, by projection.

Contribution

The paper replaces token selection with continuous prediction in embedding space. This enables iterative refinement before commitment and exposes a continuous control surface for language generation.

Findings

01. Continuous prediction yields distinct text structure and dynamics.

02. The method outperforms token-space autoregressive baselines in compute-matched reranking.

03. It reveals a continuous control surface influencing generation before token commitment.

Abstract

Standard autoregressive language models generate text by repeatedly selecting a discrete next token, coupling prediction with irreversible commitment at every step. We show that token selection is not the only viable autoregressive interface. Projected Autoregression replaces token selection with continuous prediction in embedding space followed by discrete projection at commitment time. The model predicts next-token vectors via regression and contrastive objectives, while discrete tokens arise only by nearest-neighbor projection. An optional mutable suffix ("liquid tail") enables iterative refinement before commitment, but the central change is more basic: next-step prediction is continuous, and discrete tokens are produced only as a downstream interface. Projected Autoregression establishes a concrete alternative to token-selection autoregression: language generation…
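
To make the predict-then-project loop described in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the Euclidean distance metric, the function names `project_to_token` and `generate`, the toy embedding table, and the hand-written `toy_predictor` (standing in for a trained model) are all assumptions made for illustration. The abstract states only that tokens arise by nearest-neighbor projection of a predicted vector.

```python
import math


def project_to_token(vector, embedding_table):
    """Return the index of the embedding row nearest to `vector`.

    Assumption: squared Euclidean distance. The abstract does not say
    which metric the nearest-neighbor projection uses.
    """
    best_index, best_dist = -1, math.inf
    for index, row in enumerate(embedding_table):
        dist = sum((a - b) ** 2 for a, b in zip(vector, row))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def generate(predict_next, prompt_ids, embedding_table, steps):
    """Autoregressive loop: continuous prediction, then discrete commitment."""
    ids = list(prompt_ids)
    for _ in range(steps):
        context = [embedding_table[i] for i in ids]
        predicted = predict_next(context)  # continuous next-step vector
        ids.append(project_to_token(predicted, embedding_table))  # commit
    return ids


# Hypothetical 4-token vocabulary embedded in 2 dimensions.
TOY_EMBEDDINGS = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def toy_predictor(context):
    """Hypothetical stand-in for a trained model: flips the first
    coordinate of the last embedding and adds a small offset, so the
    output is deliberately off-grid and must be projected."""
    last = context[-1]
    return [1.0 - last[0] + 0.1, last[1] - 0.1]


if __name__ == "__main__":
    print(generate(toy_predictor, [0], TOY_EMBEDDINGS, steps=2))
```

In this sketch the predicted vector never has to coincide with a vocabulary embedding; the projection step is the only place where a discrete token identity appears, which is the separation between prediction and commitment that the abstract emphasizes.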
