Federated Attention: A Distributed Paradigm for Collaborative LLM Inference over Edge Networks
Xiumei Deng, Zehui Xiong, Binbin Chen, Dong In Kim, Merouane Debbah, H. Vincent Poor

TL;DR
Federated Attention (FedAttn) is a distributed LLM inference framework for edge networks that integrates the federated paradigm into the self-attention mechanism, targeting privacy protection, communication efficiency, and computational efficiency at the same time.
Contribution
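This work presents FedAttn, a distributed LLM inference framework that integrates the federated paradigm into self-attention: participants run self-attention locally over their own token representations and periodically exchange and aggregate Key-Value (KV) matrices, so responses are generated collaboratively without exposing private prompts.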
Findings
Theoretical analysis of error propagation in FedAttn.
Trade-off characterization between response quality and efficiency.
Experimental validation showing scalability and optimization opportunities.
Abstract
Large language models (LLMs) are proliferating rapidly at the edge, delivering intelligent capabilities across diverse application scenarios. However, their practical deployment in collaborative scenarios confronts fundamental challenges: privacy vulnerabilities, communication overhead, and computational bottlenecks. To address these, we propose Federated Attention (FedAttn), which integrates the federated paradigm into the self-attention mechanism, creating a new distributed LLM inference framework that simultaneously achieves privacy protection, communication efficiency, and computational efficiency. FedAttn enables participants to perform local self-attention over their own token representations while periodically exchanging and aggregating Key-Value (KV) matrices across multiple Transformer blocks, collaboratively generating LLM responses without exposing private prompts. Further,…
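The mechanism described in the abstract (local self-attention with periodic KV exchange) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes single-head attention without causal masking or feed-forward layers, shared projection weights across participants, and aggregation by simple concatenation of the KV matrices. The names `fedattn_forward` and `sync_every` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Scaled dot-product attention (single head, no mask).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


def fedattn_forward(local_tokens, weights, sync_every):
    """Illustrative forward pass (assumed structure, not the paper's code).

    local_tokens: list of (n_i, d) arrays, one per participant.
    weights: list of (Wq, Wk, Wv) tuples, one per Transformer block.
    sync_every: exchange KV matrices every `sync_every` blocks.
    Returns the per-participant hidden states and the number of exchanges.
    """
    hidden = [x.copy() for x in local_tokens]
    exchanges = 0
    for block, (wq, wk, wv) in enumerate(weights):
        qs = [h @ wq for h in hidden]
        ks = [h @ wk for h in hidden]
        vs = [h @ wv for h in hidden]
        if (block + 1) % sync_every == 0:
            # Sync block: aggregate KV across participants (assumed: concatenation).
            k_all = np.concatenate(ks, axis=0)
            v_all = np.concatenate(vs, axis=0)
            outs = [attention(q, k_all, v_all) for q in qs]
            exchanges += 1
        else:
            # Local block: each participant attends only to its own tokens.
            outs = [attention(q, k, v) for q, k, v in zip(qs, ks, vs)]
        # Residual connection around the attention output.
        hidden = [h + o for h, o in zip(hidden, outs)]
    return hidden, exchanges
```

In this simplified setting, `sync_every=1` reproduces attention over the concatenation of all participants' tokens, while larger values reduce the number of KV exchanges at the cost of deviating from that centralized result. Only KV projections cross participant boundaries in the sketch, never the raw token inputs.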
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Big Data and Digital Economy
