Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Xinyu Yang, Yuwei An, Hongyi Liu, Tianqi Chen, Beidi Chen

TL;DR
Multiverse is a natively parallel generative language model that internalizes a MapReduce paradigm to make reasoning and generation faster and more efficient. Its tools and models are open-sourced.
Contribution
We propose Multiverse, a novel parallel generation framework for language models that spans data curation, an attention mechanism, and an inference system, and achieves performance comparable to autoregressive models.
Findings
Multiverse-32B matches the performance of AR-LLMs after minimal fine-tuning.
Multiverse outperforms AR-LLMs in scaling and efficiency.
Open-sourced ecosystem facilitates research and deployment.
Abstract
Autoregressive Large Language Models (AR-LLMs) frequently exhibit implicit parallelism in sequential generation. Inspired by this, we introduce Multiverse, a new generative model that enables natively parallel generation. Multiverse internalizes a MapReduce paradigm, generating automatically through three stages: (i) a Map stage for adaptive task decomposition, (ii) a Process stage for parallel subtask execution, and (iii) a Reduce stage for lossless result synthesis. Next, we build a real-world Multiverse reasoning model with co-design of data, algorithm, and system, enabling rapid and seamless transfer from frontier AR-LLMs. For data creation, we develop Multiverse Curator, an automated LLM-assisted pipeline that transforms sequential reasoning chains into structured training data, avoiding costly human annotations. Algorithmically, we design Multiverse Attention to separate parallel…
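The three stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`map_stage`, `process_stage`, `reduce_stage`, `multiverse_style_generate`), the semicolon-based splitting rule, and the string-joining merge are all hypothetical stand-ins chosen only to show the Map, Process, Reduce control flow. In Multiverse itself the model decides how to decompose and merge; here those decisions are hard-coded.

```python
from concurrent.futures import ThreadPoolExecutor


def map_stage(task: str) -> list[str]:
    """Map: split a task into independent subtasks.

    Hypothetical rule: subtasks are separated by semicolons.
    """
    return [part.strip() for part in task.split(";") if part.strip()]


def process_stage(subtask: str) -> str:
    """Process: handle one subtask independently of the others.

    Hypothetical stand-in for a model generating one parallel branch.
    """
    return f"result({subtask})"


def reduce_stage(results: list[str]) -> str:
    """Reduce: merge all branch results without dropping any of them."""
    return " | ".join(results)


def multiverse_style_generate(task: str) -> str:
    """Run the Map, Process, and Reduce stages in order."""
    subtasks = map_stage(task)
    # Executor.map returns results in the same order as the inputs,
    # so the merged output is deterministic even though branches run concurrently.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(process_stage, subtasks))
    return reduce_stage(results)


if __name__ == "__main__":
    print(multiverse_style_generate("factor 91; check 7 * 13; report"))
```

Threads are used here only because the toy branches are independent function calls; how the real system schedules parallel branches is not described in the portion of the abstract shown above.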
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Big Data and Digital Economy
Methods: Softmax · Attention Is All You Need
