GMoE: Empowering LLMs Fine-Tuning via MoE Graph Collaboration

Ting Bai; Yue Yu; Le Huang; Zenan Xu; Chuan Shi

arXiv:2412.16216 · cs.LG · November 25, 2025

TL;DR

This paper introduces GMoE, a graph-based MoE framework that improves expert collaboration and stability in LLM fine-tuning, addressing load imbalance issues with novel routing and coordination strategies.

Contribution

GMoE presents a new graph router and coordination strategies for MoE, enhancing expert collaboration and stability during LLM fine-tuning with parameter-efficient methods.

Findings

01 GMoE outperforms baseline models on multiple benchmarks.

02 The graph routing improves expert collaboration.

03 Coordination strategies increase model stability.

Abstract

The sparse Mixture-of-Experts (MoE) architecture of large language models (LLMs) suffers from an inherent load imbalance that arises from the simplistic linear router strategy, which ultimately causes instability and inefficient learning in LLMs. To address this challenge, we introduce GMoE, a novel graph-based MoE framework aimed at enhancing the collaboration among multiple experts. In GMoE, a graph router function is designed to capture the collaboration signals among experts. This enables all experts to dynamically allocate information derived from input data by sharing information with their neighboring experts. Moreover, we put forward two coordination strategies in GMoE: the Poisson distribution-based distinction strategy and the Normal distribution-based balance strategy, to further release the capacity of each expert and increase the model…
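The load imbalance that the abstract attributes to a linear router, and the general idea of experts sharing signals with their neighbors, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the ring-shaped expert graph, and the simple neighbor averaging are hypothetical stand-ins, not GMoE's actual graph router or its coordination strategies.

```python
import random


def linear_router(x, w):
    """Plain linear router: logits[t][e] = dot(x[t], column e of w)."""
    n_experts = len(w[0])
    return [
        [sum(xi * wi[e] for xi, wi in zip(tok, w)) for e in range(n_experts)]
        for tok in x
    ]


def neighbor_mix(logits, adj):
    """Hypothetical graph step: average each expert's logit with the
    logits of its neighbors in `adj` (a self-loop is always included)."""
    n = len(adj)
    mixed = []
    for row in logits:
        out = []
        for e in range(n):
            nbrs = [e] + [j for j in range(n) if j != e and adj[e][j]]
            out.append(sum(row[j] for j in nbrs) / len(nbrs))
        mixed.append(out)
    return mixed


def expert_load(logits, k):
    """Fraction of all top-k routing slots that each expert receives."""
    n = len(logits[0])
    counts = [0] * n
    for row in logits:
        top = sorted(range(n), key=lambda e: row[e], reverse=True)[:k]
        for e in top:
            counts[e] += 1
    total = sum(counts)
    return [c / total for c in counts]


def load_cv(load):
    """Coefficient of variation of the load; 0 means perfectly balanced."""
    mean = sum(load) / len(load)
    var = sum((v - mean) ** 2 for v in load) / len(load)
    return var ** 0.5 / mean


if __name__ == "__main__":
    random.seed(0)
    d, n_experts, n_tokens = 8, 4, 200
    x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_tokens)]
    w = [[random.gauss(0, 1) for _ in range(n_experts)] for _ in range(d)]
    # Assumed ring graph: expert i is linked to experts i-1 and i+1.
    ring = [[1 if abs(i - j) in (1, n_experts - 1) else 0
             for j in range(n_experts)] for i in range(n_experts)]

    logits = linear_router(x, w)
    print("linear load:", expert_load(logits, k=2))
    print("linear CV:  ", load_cv(expert_load(logits, k=2)))
    mixed = neighbor_mix(logits, ring)
    print("mixed load: ", expert_load(mixed, k=2))
    print("mixed CV:   ", load_cv(expert_load(mixed, k=2)))
```

The coefficient of variation is only one possible imbalance measure, and the sketch makes no claim that neighbor averaging reduces it. It shows where a graph over experts could enter the routing computation, nothing more.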

Taxonomy

Topics: Semantic Web and Ontologies · Data Mining Algorithms and Applications · Biomedical Text Mining and Ontologies

Methods: Mixture of Experts