Mixture of Heterogeneous Grouped Experts for Language Modeling

Zhicheng Ma; Xiang Liu; Zhaoxiang Liu; Ning Wang; Yi Shen; Kai Wang; Shuming Shi; Shiguo Lian

arXiv:2604.23108·cs.CL·April 29, 2026

TL;DR

This paper introduces MoHGE, a resource-efficient mixture of heterogeneous grouped experts for language modeling, balancing performance, parameter efficiency, and GPU utilization.

Contribution

The paper proposes a novel two-level routing mechanism and auxiliary loss mechanisms to improve resource efficiency and load balancing in heterogeneous MoE architectures.

Findings

1. MoHGE matches MoE performance with 20% fewer parameters.

2. It achieves balanced GPU utilization during inference.

3. The approach reduces deployment costs in real-world scenarios.

Abstract

Large Language Models (LLMs) based on Mixture-of-Experts (MoE) are pivotal in industrial applications for their ability to scale performance efficiently. However, standard MoEs enforce uniform expert sizes, creating a rigidity that fails to align computational costs with varying token-level complexity. While heterogeneous expert architectures attempt to address this by diversifying expert sizes, they often suffer from significant system-level challenges, specifically unbalanced GPU utilization and inefficient parameter utilization, which hinder practical deployment. To bridge the gap between theoretical heterogeneity and robust industrial application, we propose Mixture of Heterogeneous Grouped Experts (MoHGE), which introduces a two-level routing mechanism to enable flexible, resource-aware expert combinations. To optimize inference efficiency, we propose a Group-Wise Auxiliary Loss,…
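
The abstract names a two-level routing mechanism but, as excerpted here, does not spell it out. The sketch below is one plausible reading, offered only as an illustration: a token first picks a group of experts, then picks its top-k experts inside that group. The function name `two_level_route`, the top-1 group choice, the softmax gating, and all weights are assumptions, not details taken from the paper or from the UnicomAI/MoHGE repository, and heterogeneous expert sizes are not modelled.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def two_level_route(token, group_router, expert_routers, top_k=2):
    """Hypothetical two-level router (illustrative, not the paper's method).

    token: hidden state as a list of floats
    group_router: one weight vector per group
    expert_routers: expert_routers[g] holds one weight vector per expert in group g
    Returns (group index, [(expert index, gate weight), ...]).
    """
    # Level 1: score every group and keep the most probable one (assumed top-1).
    group_probs = softmax([dot(token, w) for w in group_router])
    g = max(range(len(group_probs)), key=group_probs.__getitem__)

    # Level 2: score only the experts that belong to the chosen group.
    expert_probs = softmax([dot(token, w) for w in expert_routers[g]])
    ranked = sorted(range(len(expert_probs)),
                    key=expert_probs.__getitem__, reverse=True)
    chosen = ranked[:top_k]

    # Renormalize the gates over the chosen experts so they sum to 1.
    norm = sum(expert_probs[i] for i in chosen)
    return g, [(i, expert_probs[i] / norm) for i in chosen]

# Toy usage with made-up weights: two groups, holding 2 and 3 experts.
token = [1.0, 0.0]
group_router = [[0.0, 1.0], [2.0, 0.0]]
expert_routers = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.0], [3.0, 0.0], [1.0, 0.0]],
]
group, gates = two_level_route(token, group_router, expert_routers, top_k=2)
```

Restricting the second-level choice to a single group is a design assumption made here for simplicity; the actual scheme may combine experts across groups or route at a different granularity.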

Code & Models

Repositories

UnicomAI/MoHGE (GitHub)
