Save It All: Enabling Full Parameter Tuning for Federated Large Language Models via Cycle Block Gradient Descent

Lin Wang; Zhichao Wang; Xiaoying Tang

arXiv:2406.11187·cs.LG·July 22, 2024

TL;DR

This paper introduces FedCyBGD, a federated learning method for large language models that enables full-parameter tuning with minimal resource consumption by combining cycle block gradient descent with a compression scheme.

Contribution

The paper presents FedCyBGD, a new approach that allows full parameter training of LLMs in federated learning with reduced communication, computation, and memory costs.
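
To make the idea of updating a model one block at a time concrete, here is a minimal sketch of generic cyclic block gradient descent on a toy quadratic objective. It is not the paper's algorithm: the function `cyclic_block_gd`, the toy `grad_fn`, the block layout, and the `values_sent` counter are all hypothetical names chosen for illustration, and the compression scheme is not modeled at all. The sketch only assumes that each round touches a single block of parameters, so only that block would need to be trained and transmitted.

```python
def cyclic_block_gd(params, grad_fn, blocks, rounds, lr):
    """Generic cyclic block gradient descent (illustrative only).

    Each round updates the coordinates of exactly one block and leaves
    the rest frozen; blocks are visited in a fixed cycle.
    """
    params = list(params)
    values_sent = 0  # hypothetical proxy for communication cost
    for r in range(rounds):
        block = blocks[r % len(blocks)]  # pick the next block in the cycle
        grad = grad_fn(params)
        for i in block:
            params[i] -= lr * grad[i]    # update only this block
        values_sent += len(block)        # only the updated block is "sent"
    return params, values_sent


# Toy objective (an assumption): f(w) = 0.5 * sum((w_i - t_i)^2)
target = [1.0, -2.0, 3.0, 0.5]


def grad_fn(w):
    return [wi - ti for wi, ti in zip(w, target)]


blocks = [[0, 1], [2, 3]]  # two blocks of two coordinates each
result, values_sent = cyclic_block_gd(
    [0.0] * 4, grad_fn, blocks, rounds=200, lr=0.5
)
print(result, values_sent)
```

In this toy setting every coordinate still converges to its target even though each round moves only half of the parameters, and the number of values exchanged per round is the block size rather than the full model size.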

Findings

01. Achieves state-of-the-art performance in federated LLM training.

02. Reduces communication and resource costs significantly.

03. Enables full parameter tuning in federated settings.

Abstract

The advent of large language models (LLMs) has revolutionized the deep learning paradigm, yielding impressive results across a wide array of tasks. However, pre-training or fine-tuning LLMs within a federated learning (FL) framework poses substantial challenges, including considerable computational and memory demands as well as communication bottlenecks between servers and clients. Existing solutions either make the unrealistic assumption that the entire model is exchanged for training, or apply parameter-efficient fine-tuning methods from centralized learning to train LLMs in FL, which tend to underperform during training or fine-tuning because parameter updates are confined to a limited search subspace. In this paper, we introduce a novel method for the efficient training and fine-tuning of LLMs in FL, with minimal resource consumption. Our approach, termed FedCyBGD,…

Code & Models

Repositories

L3030/FedCyBGD (PyTorch, official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data