Communication-Efficient Federated Fine-Tuning
Michael Theologitis, Vasilis Samoladas, Antonios Deligiannakis

TL;DR
This paper introduces FDA-Opt, a new algorithm family for federated fine-tuning of large language models, reducing communication costs and outperforming existing methods without extra configuration.
Contribution
FDA-Opt unifies the FDA algorithm and the FedOpt family, offering a practical approach to communication-efficient federated fine-tuning of language models that needs no additional hyper-parameter tuning.
Findings
In the reported experiments on NLP tasks, FDA-Opt outperforms FedOpt and other existing methods.
FDA-Opt requires no additional configuration and works as a drop-in replacement for FedOpt.
Abstract
Federated Learning (FL) enables the utilization of vast, previously inaccessible data sources. At the same time, pre-trained Language Models (LMs) have taken the world by storm and for good reason. They exhibit remarkable emergent abilities and are readily adapted to downstream tasks. This opens one of the most exciting frontiers in FL: fine-tuning LMs. Yet, a persistent challenge in FL is the frequent, rigid communication of parameters -- a problem magnified by the sheer size of these contemporary models. The FedOpt family of algorithms has become the go-to approach for FL, relying on fixed but arbitrary intervals for model exchanges. Recently, the FDA algorithm prescribed a dynamic approach by monitoring the training progress. However, it introduced a hard-to-calibrate parameter and imposed a rigid synchronization scheme. In this work, we address these limitations by proposing the…
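The contrast the abstract draws between fixed-interval model exchanges and progress-driven exchanges can be illustrated with a toy simulation. The sketch below is not the authors' FDA or FDA-Opt algorithm: the quadratic loss, the `drift` measure (mean squared distance of client models from the last synchronized model), the threshold of 0.5, and all function names are assumptions chosen only to show the two triggering styles side by side.

```python
import random


def average(models):
    """Coordinate-wise mean of a list of equal-length parameter vectors."""
    n = len(models)
    return [sum(vals) / n for vals in zip(*models)]


def drift(models, reference):
    """Mean squared distance of client models from the last synchronized model.

    Hypothetical stand-in for "monitoring the training progress".
    """
    total = sum(
        sum((w - r) ** 2 for w, r in zip(m, reference)) for m in models
    )
    return total / len(models)


def local_step(model, target, lr, rng):
    """One noisy gradient step on the toy loss 0.5 * ||w - target||^2."""
    return [w - lr * ((w - t) + rng.gauss(0.0, 0.1)) for w, t in zip(model, target)]


def fixed_interval(clients, reference, steps_since_sync):
    """Synchronize every 5 local steps, regardless of training state."""
    return steps_since_sync >= 5


def drift_triggered(clients, reference, steps_since_sync):
    """Synchronize only when clients have drifted past an assumed threshold."""
    return drift(clients, reference) > 0.5


def run(num_steps, should_sync, seed=0):
    """Simulate three clients and count how many synchronizations occur."""
    rng = random.Random(seed)
    targets = [[1.0, 2.0], [3.0, -1.0], [-2.0, 0.5]]  # toy per-client optima
    global_model = [0.0, 0.0]
    clients = [list(global_model) for _ in targets]
    rounds = 0
    steps_since_sync = 0
    for _ in range(num_steps):
        clients = [local_step(m, t, 0.1, rng) for m, t in zip(clients, targets)]
        steps_since_sync += 1
        if should_sync(clients, global_model, steps_since_sync):
            # Communication round: average the models and broadcast the result.
            global_model = average(clients)
            clients = [list(global_model) for _ in targets]
            rounds += 1
            steps_since_sync = 0
    return global_model, rounds


if __name__ == "__main__":
    for name, rule in [("fixed", fixed_interval), ("drift", drift_triggered)]:
        model, rounds = run(100, rule)
        print(name, "rounds:", rounds, "model:", [round(w, 3) for w in model])
```

Both rules share the same training loop and differ only in the `should_sync` predicate, which is what makes a drop-in comparison possible in this toy setting. The number of rounds the drift rule produces depends entirely on the assumed threshold, which is the kind of calibration burden the abstract attributes to the original FDA.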
