Federated Learning-Enabled Hybrid Language Models for Communication-Efficient Token Transmission
Faranaksadat Solat, Joohyung Lee, Mohamed Seif, Dusit Niyato, and H. Vincent Poor

TL;DR
This paper introduces FedHLM, a federated learning framework for hybrid language models that significantly reduces communication overhead by collaboratively learning token uncertainty thresholds and enabling peer-to-peer token reuse, with minimal accuracy loss.
Contribution
FedHLM is the first to integrate uncertainty-aware inference with federated learning for hybrid language models, optimizing token-level thresholds and reducing LLM transmissions in a privacy-preserving manner.
Findings
Reduces LLM transmissions by over 95% in experiments.
Maintains high accuracy with negligible loss.
Enables peer-to-peer token reuse for efficiency.
Abstract
Hybrid Language Models (HLMs) combine the low-latency efficiency of Small Language Models (SLMs) on edge devices with the high accuracy of Large Language Models (LLMs) on centralized servers. Unlike traditional end-to-end LLM inference, HLMs reduce latency and communication by invoking LLMs only when local SLM predictions are uncertain, i.e., when token-level confidence is low or entropy is high. However, ambiguous or low-confidence predictions still require frequent offloading to the LLM, leading to significant communication overhead in bandwidth-constrained settings. To address this, we propose FedHLM, a communication-efficient HLM framework that integrates uncertainty-aware inference with Federated Learning (FL). FedHLM's key innovation lies in collaboratively learning token-level uncertainty thresholds that govern when LLM assistance is needed. Rather than using static or manually…
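To make the gating idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes uncertainty is measured as the Shannon entropy of the SLM's next-token distribution, and that clients combine their thresholds with a FedAvg-style weighted mean. The abstract says only that thresholds are learned collaboratively, so the aggregation rule and the names `token_entropy`, `needs_llm`, and `average_thresholds` are all hypothetical.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def needs_llm(probs: Sequence[float], threshold: float) -> bool:
    """Offload a token to the LLM only when SLM entropy exceeds the threshold."""
    return token_entropy(probs) > threshold


def average_thresholds(thresholds: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Weighted mean of per-client thresholds.

    Assumption: a FedAvg-style aggregation rule, chosen for illustration;
    the source does not specify how thresholds are combined.
    """
    total = sum(weights)
    return sum(t * w for t, w in zip(thresholds, weights)) / total
```

Under these assumptions, a sharply peaked distribution such as `[0.97, 0.01, 0.01, 0.01]` stays on the device at a threshold of 0.5, while a uniform distribution over four tokens is offloaded. Only the scalar thresholds are shared in this sketch, never the tokens or prompts themselves.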
Taxonomy
Topics: IoT and Edge/Fog Computing · Privacy-Preserving Technologies in Data · Big Data and Digital Economy
