PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts
Yang Su, Na Yan, Yansha Deng, and Robert Schober

TL;DR
The paper introduces PWC-MoE, a privacy-aware, bandwidth-adaptive framework that dynamically routes tokens in wireless collaborative models to balance privacy, performance, and communication constraints.
Contribution
It proposes a novel gating and load-balancing mechanism for privacy-aware token routing in wireless LLM deployment, addressing privacy, efficiency, and scalability.
Findings
Preserves privacy effectively in bandwidth-limited environments.
Maintains high model performance through adaptive token offloading.
Balances load among experts to prevent any single expert from being overloaded.
Abstract
Large language models (LLMs) hosted on cloud servers alleviate the computational and storage burdens on local devices but raise privacy concerns due to sensitive data transmission and require substantial communication bandwidth, which is challenging in constrained environments. In contrast, small language models (SLMs) running locally enhance privacy but suffer from limited performance on complex tasks. To balance computational cost, performance, and privacy protection under bandwidth constraints, we propose a privacy-aware wireless collaborative mixture of experts (PWC-MoE) framework. Specifically, PWC-MoE employs a sparse privacy-aware gating network to dynamically route sensitive tokens to privacy experts located on local clients, while non-sensitive tokens are routed to non-privacy experts located at the remote base station. To achieve computational efficiency, the gating network…
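The routing rule described in the abstract (sensitive tokens go to privacy experts on the local client, non-sensitive tokens go to non-privacy experts at the base station) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `route_token`, the expert index lists, the `is_sensitive` flag, and the example gate scores are all hypothetical, and the sketch assumes the sensitivity of each token is already known.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(gate_scores, is_sensitive, local_experts, remote_experts, top_k=1):
    """Pick the top_k experts for one token, restricted by privacy class.

    gate_scores: one raw gate score per expert (hypothetical values).
    is_sensitive: assumed to be supplied by some upstream detector.
    Returns a list of (expert_id, weight) pairs whose weights sum to 1.
    """
    # Sensitive tokens may only use local (privacy) experts;
    # all other tokens may only use remote (non-privacy) experts.
    allowed = local_experts if is_sensitive else remote_experts
    # Sparse selection: keep only the top_k allowed experts by gate score.
    ranked = sorted(allowed, key=lambda e: gate_scores[e], reverse=True)[:top_k]
    # Renormalise the gate scores over the selected experts.
    weights = softmax([gate_scores[e] for e in ranked])
    return list(zip(ranked, weights))

LOCAL = [0, 1]    # assumed: experts hosted on the client device
REMOTE = [2, 3]   # assumed: experts hosted at the base station
scores = [0.2, 1.5, 3.0, 0.1]

sensitive_route = route_token(scores, True, LOCAL, REMOTE)
public_route = route_token(scores, False, LOCAL, REMOTE)
print(sensitive_route, public_route)
```

In this example expert 2 has the highest gate score overall, yet the sensitive token is still kept on a local expert, because the mask is applied before the top-k selection. Only tokens routed to remote experts would need to cross the wireless link.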
Taxonomy
Methods: Balanced Selection
