Evo: Autoregressive-Diffusion Large Language Models with Evolving Balance
Junde Wu, Minhao Hu, Jiayuan Zhu, Yuyuan Liu, Tianyi Zhang, Kang Li, Jingkun Chen, Jiazhen Pan, Min Xu, Yueming Jin

TL;DR
Evo introduces a unified latent trajectory model that seamlessly combines autoregressive and diffusion-based language generation, enabling adaptive, efficient, and high-quality text synthesis across diverse tasks.
Contribution
The paper presents Evo, a novel duality latent trajectory model that unifies autoregressive and diffusion models within a continuous evolutionary framework, with a shared probability flow and end-to-end training.
Findings
Achieves state-of-the-art results on multiple benchmarks.
Maintains fast inference speed while improving quality.
Demonstrates robust reasoning and code generation capabilities.
Abstract
We introduce Evo, a duality latent trajectory model that bridges autoregressive (AR) and diffusion-based language generation within a continuous evolutionary generative framework. Rather than treating AR decoding and diffusion generation as separate paradigms, Evo reconceptualizes text generation as a latent flow: each token is associated with a vector-valued embedding that evolves over a progression variable, indicating its semantic maturity. Low values of this variable correspond to confident AR-like refinement, while high values invoke diffusion-style planning, allowing the model to adaptively balance AR and diffusion based on uncertainty. Theoretically, we show that both AR and diffusion models emerge as discretizations of a shared probability flow, and we derive Evo's training objective from a unified variational ELBO. The model is implemented as a time-conditioned…
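As a rough illustration of the uncertainty-driven balance described in the abstract, the toy sketch below maps a token's predictive distribution to a value in [0, 1] and picks a generation mode from it. This is not Evo's method: using normalized entropy as a stand-in for the progression variable, the function names `progression_from_uncertainty` and `choose_mode`, and the hard 0.5 threshold are all assumptions made for illustration. The abstract describes an adaptive balance, not a binary switch.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def progression_from_uncertainty(probs):
    """Map a predictive distribution to a value in [0, 1].

    Assumption (illustration only): normalized entropy stands in for the
    per-token progression variable. 0 means fully confident, 1 means
    maximally uncertain.
    """
    n = len(probs)
    if n < 2:
        return 0.0
    return entropy(probs) / math.log(n)


def choose_mode(t, threshold=0.5):
    """Hypothetical hard switch between the two regimes.

    Low values select AR-like refinement, high values select
    diffusion-style planning, mirroring the abstract's description.
    """
    return "ar_refine" if t < threshold else "diffusion_plan"


# Two toy distributions over a 4-word vocabulary.
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]

t_conf = progression_from_uncertainty(confident)
t_unc = progression_from_uncertainty(uncertain)

print(choose_mode(t_conf), choose_mode(t_unc))
```

In this sketch the peaked distribution yields a small value and is routed to the AR-like branch, while the uniform distribution yields a value near 1 and is routed to the diffusion-style branch.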
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Language and cultural evolution
