CELLM: An Efficient Communication in Large Language Models Training for Federated Learning
Raja Vavekanand, Kira Sam

TL;DR
This paper introduces CELLM, a method combining low-rank adaptation and sparse communication to efficiently train large language models in federated learning, significantly reducing communication costs while maintaining high utility.
Contribution
CELLM is the first approach to effectively integrate LoRA and sparse updates for federated LLM training, addressing communication and computation bottlenecks.
Findings
Reduces communication costs by up to 10x compared to vanilla LoRA.
Achieves up to 5x reduction over complex sparse LoRA baselines.
Maintains or improves model utility with optimized sparsity and rank configurations.
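To make the source of the savings concrete, here is a minimal sketch of how low-rank adapters and sparse updates each shrink what a client has to upload. It is an illustration under stated assumptions, not the paper's implementation: this summary does not say how CELLM selects which entries to send, so top-k magnitude selection is an assumed stand-in, and the function names `lora_param_count`, `top_k_sparsify`, and `densify` are hypothetical.

```python
def lora_param_count(d_out, d_in, rank):
    """Trainable values in one LoRA pair: B is (d_out x rank), A is (rank x d_in)."""
    return rank * (d_out + d_in)


def top_k_sparsify(update, density):
    """Keep the largest-magnitude fraction `density` of a flat update.

    Returns a dict {index: value}; only these pairs would be communicated.
    Top-k selection is an assumption made for this sketch.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    k = max(1, int(len(update) * density))
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in sorted(ranked[:k])}


def densify(sparse_update, length):
    """Server side: rebuild a dense vector, filling unsent entries with zero."""
    dense = [0.0] * length
    for index, value in sparse_update.items():
        dense[index] = value
    return dense


# Dense fine-tuning of a 4096 x 4096 weight matrix vs. a rank-16 LoRA pair.
dense_values = 4096 * 4096
lora_values = lora_param_count(4096, 4096, rank=16)

# A toy flattened LoRA update, sent at 50% density.
toy_update = [0.1, -2.0, 0.05, 1.5]
sent = top_k_sparsify(toy_update, density=0.5)
restored = densify(sent, len(toy_update))
```

Note that a sparse message also has to carry indices (or a mask) alongside the values, so the realised saving is smaller than the density alone suggests.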
Abstract
Federated Learning (FL) is a recent model training paradigm in which client devices collaboratively train a model without ever aggregating their data. Crucially, this scheme offers users potential privacy and security benefits by only ever communicating updates to the model weights to a central server, as opposed to traditional machine learning (ML) training, which directly communicates and aggregates data. However, FL training suffers from statistical heterogeneity, as clients may have differing local data distributions. Large language models (LLMs) offer a potential solution to this issue of heterogeneity, given that they have consistently been shown to be able to learn on vast amounts of noisy data. While LLMs are a promising development for resolving the persistent issue of non-I.I.D. clients in federated settings, they exacerbate two other bottlenecks in FL: limited local computing and…
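The server-side step described in the abstract, where clients send weight updates rather than data, can be sketched as a weighted average. This assumes a FedAvg-style aggregation weighted by local dataset size, which the abstract does not specify; `federated_average` is a hypothetical name used only for illustration.

```python
def federated_average(client_updates, client_sizes):
    """Combine per-client weight updates into one global update.

    client_updates: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training examples per client, used as weights
                    (an assumption of this sketch).
    """
    if len(client_updates) != len(client_sizes) or not client_updates:
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    averaged = [0.0] * len(client_updates[0])
    for update, size in zip(client_updates, client_sizes):
        weight = size / total
        for i, value in enumerate(update):
            averaged[i] += weight * value
    return averaged


# Two clients with different amounts of local data; raw data never leaves them.
global_update = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Because clients with non-I.I.D. data push the average in different directions, the quality of this aggregate depends on how heterogeneous the local distributions are.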
Taxonomy
Topics: Privacy-Preserving Technologies in Data
