WDMoE: Wireless Distributed Large Language Models with Mixture of Experts
Nan Xue, Yaping Sun, Zhiyong Chen, Meixia Tao, Xiaodong Xu, Liang Qian, Shuguang Cui, Ping Zhang

TL;DR
This paper introduces WDMoE, a wireless distributed large language model framework using Mixture of Experts, which enhances performance and reduces latency by leveraging edge server and device collaboration in wireless systems.
Contribution
The paper proposes a novel wireless distributed LLM paradigm with expert distribution and a dynamic expert selection policy to improve efficiency and stability in wireless environments.
Findings
WDMoE outperforms existing models like Llama 2 in accuracy.
It significantly reduces end-to-end latency in wireless LLM deployment.
The approach demonstrates robustness across various datasets and LLMs.
Abstract
Large Language Models (LLMs) have achieved significant success in various natural language processing tasks, but how wireless communications can support LLMs has not been extensively studied. In this paper, we propose a wireless distributed LLMs paradigm based on Mixture of Experts (MoE), named WDMoE, deploying LLMs collaboratively across edge servers of base station (BS) and mobile devices in the wireless communications system. Specifically, we decompose the MoE layer in LLMs by deploying the gating network and the preceding neural network layer at BS, while distributing the expert networks across the devices. This arrangement leverages the parallel capabilities of expert networks on distributed devices. Moreover, to overcome the instability of wireless communications, we design an expert selection policy by taking into account both the performance of the model and the end-to-end…
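The abstract describes a gating network at the BS that routes tokens to expert networks on devices, with a selection policy that weighs model performance against end-to-end latency. The sketch below illustrates one way such a trade-off could look. It is not the paper's policy, which the truncated abstract does not spell out. The function `select_experts`, the linear penalty `latency_weight`, and the per-device latency estimates in milliseconds are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of gate logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_experts(gate_logits, latencies_ms, k=2, latency_weight=0.01):
    """Pick k experts by gate probability minus a latency penalty.

    Hypothetical scoring rule: score_i = p_i - latency_weight * latency_i.
    With latency_weight = 0 this reduces to plain top-k MoE routing.
    """
    probs = softmax(gate_logits)
    scores = [p - latency_weight * t for p, t in zip(probs, latencies_ms)]
    chosen = sorted(range(len(probs)), key=lambda i: scores[i], reverse=True)[:k]

    # Renormalize the gate weights over the chosen experts only.
    norm = sum(probs[i] for i in chosen)
    weights = {i: probs[i] / norm for i in chosen}

    # Experts run in parallel on separate devices, so the layer
    # waits for the slowest selected device.
    layer_latency_ms = max(latencies_ms[i] for i in chosen)
    return weights, layer_latency_ms


if __name__ == "__main__":
    gate_logits = [2.0, 1.0, 0.5, 0.1]   # one token, four experts
    latencies_ms = [10, 80, 15, 20]      # assumed per-device round-trip estimates
    print(select_experts(gate_logits, latencies_ms))
    print(select_experts(gate_logits, latencies_ms, latency_weight=0.0))
```

In this toy setting, the penalty swaps the second-ranked expert, which sits on a slow link, for a lower-ranked expert on a faster device. That is the kind of accuracy-versus-latency trade-off the abstract alludes to.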
Taxonomy
Topics: Recommender Systems and Techniques · Expert finding and Q&A systems
Methods: Balanced Selection · LLaMA · Mixture of Experts
