Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Xinyu Yang; Yuwei An; Hongyi Liu; Tianqi Chen; Beidi Chen

arXiv:2506.09991·cs.LG·June 16, 2025

Open Access · 2 Models · 1 Dataset

TL;DR

Multiverse is a natively parallel generative model for language modeling. By internalizing a MapReduce paradigm, it enables faster, more efficient reasoning and generation, and it ships with open-sourced tools and models.

Contribution

We propose Multiverse, a novel parallel generation framework for language models, including new data curation, attention mechanisms, and inference systems, achieving performance comparable to autoregressive models.

Findings

01 Multiverse-32B matches AR-LLM performance after minimal fine-tuning.

02 Multiverse outperforms AR-LLMs in scaling and efficiency.

03 The open-sourced ecosystem facilitates research and deployment.

Abstract

Autoregressive Large Language Models (AR-LLMs) frequently exhibit implicit parallelism in sequential generation. Inspired by this, we introduce Multiverse, a new generative model that enables natively parallel generation. Multiverse internalizes a MapReduce paradigm, generating automatically through three stages: (i) a Map stage for adaptive task decomposition, (ii) a Process stage for parallel subtask execution, and (iii) a Reduce stage for lossless result synthesis. Next, we build a real-world Multiverse reasoning model with co-design of data, algorithm, and system, enabling rapid and seamless transfer from frontier AR-LLMs. For data creation, we develop Multiverse Curator, an automated LLM-assisted pipeline that transforms sequential reasoning chains into structured training data, avoiding costly human annotations. Algorithmically, we design Multiverse Attention to separate parallel…
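The three stages named in the abstract follow the classic MapReduce shape, which can be illustrated with a toy sketch. This is only an analogy on ordinary data, not the paper's model, attention mechanism, or inference engine: the function names (`map_stage`, `process_stage`, `reduce_stage`, `map_process_reduce`) and the sum-of-squares task are hypothetical, and the fixed `num_branches` stands in for what the abstract describes as adaptive decomposition.

```python
from concurrent.futures import ThreadPoolExecutor


def map_stage(numbers, num_branches=3):
    """Map: split the task into independent subtasks (strided slices).

    Assumption: a fixed branch count; the abstract describes the real
    decomposition as adaptive.
    """
    return [numbers[i::num_branches] for i in range(num_branches)]


def process_stage(subtask):
    """Process: solve one subtask without looking at the others."""
    return sum(x * x for x in subtask)


def reduce_stage(partials):
    """Reduce: merge the partial results into one answer, losing nothing."""
    return sum(partials)


def map_process_reduce(numbers, num_branches=3):
    """Run the three stages in order, processing subtasks in parallel."""
    subtasks = map_stage(numbers, num_branches)
    with ThreadPoolExecutor(max_workers=num_branches) as pool:
        partials = list(pool.map(process_stage, subtasks))
    return reduce_stage(partials)


if __name__ == "__main__":
    data = list(range(10))
    # The parallel result equals the purely sequential computation.
    print(map_process_reduce(data) == sum(x * x for x in data))
```

The point of the sketch is the control flow: one sequential step decides how to split, the branches run independently, and a final sequential step merges them so the result matches what a single sequential pass would produce.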

Code & Models

Datasets

Multiverse4FM/Multiverse-1K
dataset · 21 downloads

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Big Data and Digital Economy

Methods: Softmax · Attention Is All You Need