TEAM: Temporal-Spatial Consistency Guided Expert Activation for MoE Diffusion Language Model Acceleration

Linye Wei; Zixiang Luo; Pingzhi Tang; Meng Li

arXiv:2602.08404·cs.CL·February 10, 2026

Open Access

TL;DR

TEAM is a framework that accelerates MoE diffusion language models by exploiting temporal and spatial consistency in expert routing, reducing inference overhead while maintaining performance.

Contribution

The paper introduces TEAM, a plug-and-play expert activation framework that exploits the temporal and spatial consistency of expert routing decisions to speed up inference in MoE diffusion language models.

Findings

01. Achieves up to 2.2x speedup over vanilla MoE dLLMs.

02. Incurs negligible performance degradation.

03. Demonstrates effectiveness through extensive experiments.

Abstract

Diffusion large language models (dLLMs) have recently gained significant attention due to their inherent support for parallel decoding. Building on this paradigm, Mixture-of-Experts (MoE) dLLMs with autoregressive (AR) initialization have further demonstrated strong performance competitive with mainstream AR models. However, we identify a fundamental mismatch between MoE architectures and diffusion-based decoding. Specifically, a large number of experts are activated at each denoising step, while only a small subset of tokens is ultimately accepted, resulting in substantial inference overhead and limiting their deployment in latency-sensitive applications. In this work, we propose TEAM, a plug-and-play framework that accelerates MoE dLLMs by enabling more accepted tokens with fewer activated experts. TEAM is motivated by the observation that expert routing decisions exhibit strong…
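The mismatch described in the abstract can be made concrete with a small counting sketch. This does not implement TEAM and is not taken from the paper. It only counts how many distinct experts one denoising step activates across all processed tokens, compared with how many the accepted tokens alone would need. The random router scores and the names and defaults of `simulate_step` (`num_tokens`, `num_experts`, `k`, `num_accepted`) are illustrative assumptions.

```python
import random


def topk_experts(scores, k):
    """Return the indices of the k largest router scores for one token."""
    return sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)[:k]


def activated_experts(router_scores, k):
    """Union of the top-k experts over every token in router_scores."""
    active = set()
    for scores in router_scores:
        active.update(topk_experts(scores, k))
    return active


def simulate_step(num_tokens=32, num_experts=64, k=8, num_accepted=4, seed=0):
    """Compare expert usage for all tokens vs. the accepted subset.

    Assumption: uniform random scores stand in for a learned router, and
    the first num_accepted tokens stand in for the accepted ones.
    """
    rng = random.Random(seed)
    router_scores = [
        [rng.random() for _ in range(num_experts)] for _ in range(num_tokens)
    ]
    all_active = activated_experts(router_scores, k)
    accepted_active = activated_experts(router_scores[:num_accepted], k)
    return len(all_active), len(accepted_active)


if __name__ == "__main__":
    total, needed = simulate_step()
    print(f"experts activated for all tokens:  {total}")
    print(f"experts needed by accepted tokens: {needed}")
```

Under these assumptions, the accepted tokens can touch at most `num_accepted * k` experts, while the full step may touch far more. A real router is not random, and the abstract suggests that routing regularities are what TEAM exploits.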

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis