Piper: Efficient Large-Scale MoE Training via Resource Modeling and Pipelined Hybrid Parallelism

Sajal Dash; Feiyi Wang

arXiv:2605.05049·cs.DC·May 7, 2026


TL;DR

Piper is a framework that improves the efficiency of large-scale MoE training on HPC platforms through resource modeling and pipelined hybrid parallelism, achieving 2-3.5X higher MFU than X-MoE.

Contribution

It introduces a resource-aware training strategy with pipeline parallelism for MoE models, addressing memory, communication, and workload imbalance challenges.

Findings

01

Piper achieves 2-3.5X higher MFU than X-MoE.

02

A new all-to-all algorithm delivers 1.2-9X bandwidth improvements.

03

The framework effectively mitigates communication and workload bottlenecks.
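
For readers unfamiliar with the MFU metric cited in the first finding, here is a minimal sketch of how model FLOPs utilization is commonly estimated. It is not taken from the paper: the function names, the "6 FLOPs per active parameter per token" rule of thumb, and every number in the usage example are illustrative assumptions.

```python
def model_flops_utilization(active_params, tokens_per_second,
                            num_gpus, peak_flops_per_gpu):
    """Estimate MFU: achieved model FLOP/s over aggregate peak FLOP/s.

    Assumption: training cost is approximated as 6 FLOPs per *active*
    parameter per token (forward plus backward). For an MoE model only
    the experts routed to a token count as active.
    """
    achieved_flops_per_second = 6.0 * active_params * tokens_per_second
    peak_flops_per_second = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


def relative_gain(mfu_new, mfu_baseline):
    """Ratio of two MFU values, e.g. a '2X higher MFU' claim gives 2.0."""
    return mfu_new / mfu_baseline


# Hypothetical configuration, chosen only to make the arithmetic easy to follow.
mfu = model_flops_utilization(
    active_params=10e9,        # hypothetical: 10B active parameters per token
    tokens_per_second=1e6,     # hypothetical aggregate throughput
    num_gpus=1024,             # hypothetical cluster size
    peak_flops_per_gpu=1e14,   # hypothetical peak of 100 TFLOP/s per GPU
)
print(f"MFU = {mfu:.4f}")
```

Under this definition, a 2-3.5X MFU improvement on the same hardware and model corresponds to a 2-3.5X increase in training throughput, since the peak term in the denominator is unchanged.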

Abstract

Frontier models increasingly adopt Mixture-of-Experts (MoE) architectures to achieve large-model performance at reduced cost. However, training MoE models on HPC platforms is hindered by large memory footprints, frequent large-scale communication across heterogeneous networks, and severe workload imbalance. To characterize these challenges, we develop a mathematical model that quantifies memory, compute, and communication requirements for MoE configurations under various parallelization schemes, verified through micro-benchmarking, code instrumentation, and hardware profiling. Our analysis identifies performance bottlenecks: all-to-all latency at scale from expert parallelism, insufficient compute-communication overlap, low GPU utilization from imbalanced skinny GEMMs, and the absence of platform-aware hybrid parallelization strategies. To address these, we introduce Piper, a framework…
