eFedLLM: Efficient LLM Inference Based on Federated Learning
Shengwen Ding, Chenhui Hu

TL;DR
This paper presents eFedLLM, a federated learning approach that enhances the efficiency and accessibility of large language model inference through model parallelism, incentive mechanisms, and matrix optimization techniques.
Contribution
It introduces a novel federated learning framework with model parallelism and incentive mechanisms to improve LLM inference efficiency and security.
Findings
Significant reduction in computational and memory requirements.
Enhanced collaborative training of LLMs among resource-limited users.
Improved model robustness through incentive mechanisms.
Abstract
Large Language Models (LLMs) herald a transformative era in artificial intelligence (AI). However, the scale of their data and parameters demands substantial computational and memory resources, restricting their accessibility to a broader range of users and researchers. This paper introduces an effective approach that enhances the operational efficiency and affordability of LLM inference. By utilizing transformer-based federated learning (FL) with model-parallel distributed training, our model efficiently distributes the computational loads and memory requirements across a network of participants. This strategy permits users, especially those with limited resources, to train state-of-the-art LLMs collaboratively. We also introduce an incentive mechanism within the FL framework, rewarding constructive contributions and filtering out malicious activities, thereby safeguarding…
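The abstract describes spreading a transformer's compute and memory load across a network of participants. As a rough illustration of that idea only, the sketch below splits a stack of layers into contiguous blocks sized in proportion to each participant's capacity. The function name `assign_layers`, the capacity figures, and the proportional largest-remainder rule are all assumptions made for this example; the summary above does not specify how eFedLLM actually assigns layers.

```python
from typing import Dict, List


def assign_layers(num_layers: int, capacities: Dict[str, float]) -> Dict[str, List[int]]:
    """Split layer indices 0..num_layers-1 into contiguous blocks.

    Hypothetical helper: block sizes are proportional to each participant's
    capacity (e.g. GPU memory in GB), using largest-remainder rounding.
    """
    if num_layers < 0:
        raise ValueError("num_layers must be non-negative")
    if not capacities or any(cap <= 0 for cap in capacities.values()):
        raise ValueError("capacities must be non-empty and strictly positive")

    total = sum(capacities.values())
    # Ideal (fractional) number of layers per participant.
    shares = {name: num_layers * cap / total for name, cap in capacities.items()}
    # Start from the floor of each share.
    counts = {name: int(share) for name, share in shares.items()}

    # Hand the remaining layers to the largest fractional remainders.
    leftover = num_layers - sum(counts.values())
    by_remainder = sorted(capacities, key=lambda n: shares[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1

    # Turn the counts into contiguous index ranges, in insertion order.
    assignment: Dict[str, List[int]] = {}
    start = 0
    for name in capacities:
        assignment[name] = list(range(start, start + counts[name]))
        start += counts[name]
    return assignment


if __name__ == "__main__":
    # Three hypothetical participants with 8, 16, and 24 GB of memory.
    print(assign_layers(12, {"a": 8.0, "b": 16.0, "c": 24.0}))
    # {'a': [0, 1], 'b': [2, 3, 4, 5], 'c': [6, 7, 8, 9, 10, 11]}
```

Contiguous blocks are used here so that each participant only has to pass activations to the next participant in the chain. A real deployment would also need to weigh network latency and participant reliability, which this sketch ignores.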
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Blockchain Technology Applications and Security · Data Mining Algorithms and Applications
