ParaBlock: Communication-Computation Parallel Block Coordinate Federated Learning for Large Language Models
Yujia Wang, Yuanpu Cao, Jinghui Chen

TL;DR
ParaBlock introduces a parallel communication-computation scheme for federated block coordinate training of large language models, significantly reducing communication latency while maintaining convergence and performance.
Contribution
It proposes a novel parallel approach for federated block coordinate descent in LLMs, improving communication efficiency without sacrificing convergence.
Findings
Achieves the same convergence rate as standard federated block coordinate descent methods.
Significantly reduces communication latency.
Maintains strong performance in instruction following and reasoning tasks.
Abstract
Federated learning (FL) has been extensively studied as a privacy-preserving training paradigm. Recently, the federated block coordinate descent scheme has become a popular option for training large-scale models, as it allows clients to train only a subset of the model locally instead of the entire model. However, in the era of large language models (LLMs), even a single block can contain a significant number of parameters, incurring substantial communication latency, particularly for resource-constrained clients. To address this challenge in federated training and fine-tuning of LLMs, we propose ParaBlock, a novel approach that establishes two parallel threads for communication and computation to enhance communication efficiency. We theoretically prove that the proposed ParaBlock achieves the same convergence rate as the standard federated block coordinate descent methods. Empirical evaluations on…
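The two-thread idea described in the abstract can be pictured with a small toy sketch. This is not the authors' algorithm or code: it only illustrates, under simplifying assumptions, how a client might hand a finished block to a communication thread and move on to training the next block without waiting for the upload. The names `local_update`, `run_client`, and `comm_delay`, the quadratic toy objective, and the simulated latency are all hypothetical.

```python
import queue
import threading
import time


def local_update(block, lr=0.1, steps=3):
    """Hypothetical local training: gradient steps on f(x) = 0.5 * x**2 per coordinate."""
    for _ in range(steps):
        block = [x - lr * x for x in block]  # gradient of 0.5 * x**2 is x
    return block


def run_client(blocks, comm_delay=0.01):
    """Train blocks one at a time while a second thread 'uploads' finished blocks."""
    outbox = queue.Queue()
    uploaded = []  # stands in for what a server would receive: (block_id, values)

    def communicator():
        # Communication thread: drains the outbox independently of training.
        while True:
            item = outbox.get()
            if item is None:  # sentinel: no more blocks to send
                break
            time.sleep(comm_delay)  # simulated network latency (assumption)
            uploaded.append(item)

    comm_thread = threading.Thread(target=communicator)
    comm_thread.start()

    # Computation thread (here, the main thread): trains each block in turn
    # and hands it off without blocking on the upload.
    for block_id, block in enumerate(blocks):
        trained = local_update(block)
        outbox.put((block_id, trained))

    outbox.put(None)
    comm_thread.join()
    return uploaded
```

In this sketch the upload of one block overlaps with local training of the next, which is the general latency-hiding pattern the abstract refers to. How ParaBlock schedules blocks, aggregates on the server, and handles staleness is not specified in the text above and is not modeled here.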
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Big Data and Digital Economy · Advanced Graph Neural Networks
