FedsLLM: Federated Split Learning for Large Language Models over Communication Networks

Kai Zhao, Zhaohui Yang, Chongwen Huang, Xiaoming Chen, Zhaoyang Zhang

arXiv:2407.09250·cs.NI·July 15, 2024

Open Access

TL;DR

This paper introduces FedsLLM, a federated split learning framework that uses LoRA to train large language models over wireless networks and reduces training delay by jointly optimizing communication and computation.

Contribution

It combines LoRA with the splitfed learning framework to train large language models efficiently in wireless environments, and provides a joint computation and communication optimization approach that reduces training delay (by 47.63% on average, per the reported findings).

Findings

01. Training delay reduced by 47.63% on average.

02. Optimization simplifies to a convex problem for efficient solutions.

03. Framework effectively balances computation and communication loads.
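
The findings mention a delay minimization that couples computation with bandwidth allocation. The paper's actual formulation is not reproduced on this page, so the following is only a toy sketch under stated assumptions: each client's round delay is taken to be its compute time plus bits / (spectral efficiency × allocated bandwidth), and the goal is to minimize the slowest client's delay under a total bandwidth budget. The function name `allocate_bandwidth` and all parameters are hypothetical. Under this toy model the optimum equalizes all client delays, so a bisection on the common delay is enough.

```python
def allocate_bandwidth(compute_s, bits, spectral_eff, total_hz, tol=1e-9):
    """Toy min-max delay allocation (assumed model, not the paper's).

    Client i finishes at: compute_s[i] + bits[i] / (spectral_eff[i] * b_i).
    Returns the common round delay T and the bandwidth shares b_i,
    chosen so that the shares sum to total_hz.
    """
    def needed(T):
        # Total bandwidth required for every client to finish by time T.
        return sum(d / (e * (T - c))
                   for c, d, e in zip(compute_s, bits, spectral_eff))

    lo = max(compute_s)            # T must exceed every compute time
    hi = lo + 1.0
    while needed(hi) > total_hz:   # grow the bracket until T is feasible
        hi = lo + 2.0 * (hi - lo)

    while hi - lo > tol:           # needed(T) is decreasing in T
        mid = (lo + hi) / 2.0
        if needed(mid) > total_hz:
            lo = mid
        else:
            hi = mid

    T = hi
    shares = [d / (e * (T - c))
              for c, d, e in zip(compute_s, bits, spectral_eff)]
    return T, shares


if __name__ == "__main__":
    T, shares = allocate_bandwidth([1.0, 1.0], [10.0, 30.0], [1.0, 1.0], 10.0)
    print(T, shares)
```

In this sketch the client with more data to upload receives proportionally more bandwidth, which is one simple way to picture the balance between computation and communication loads.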

Abstract

To address the challenges of deploying large language models in wireless communication networks, this paper combines low-rank adaptation (LoRA) with the splitfed learning framework to propose federated split learning for large language models (FedsLLM). The method uses LoRA to reduce processing loads and divides the network into client subnetworks and server subnetworks. It leverages a federated server to aggregate and update client models. Because the training data are transmitted over a wireless network between clients and both the main and federated servers, the training delay is determined by the learning accuracy and the allocation of communication bandwidth. This paper models the minimization of the training delay by integrating computation and communication optimization, simplifying the optimization problem into a convex…
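
The abstract relies on two ingredients, low-rank adapters and server-side aggregation of client models. As a rough illustration only (not the authors' implementation), the sketch below counts the trainable parameters when a frozen d × k weight matrix is given rank-r LoRA factors, and averages flat client parameter lists weighted by local dataset size in the usual FedAvg style. The helper names `lora_trainable_params` and `fedavg` are hypothetical, and whether FedsLLM uses exactly this weighting is an assumption.

```python
def lora_trainable_params(d, k, r):
    """Trainable parameters for rank-r LoRA on a frozen d x k weight.

    LoRA adds factors B (d x r) and A (r x k); only these are trained.
    """
    return r * (d + k)


def fedavg(client_params, client_sizes):
    """Average flat parameter lists, weighted by local dataset size.

    Assumption: standard FedAvg weighting; the paper may differ.
    """
    total = sum(client_sizes)
    n = len(client_params[0])
    return [
        sum(p[i] * s for p, s in zip(client_params, client_sizes)) / total
        for i in range(n)
    ]


if __name__ == "__main__":
    full = 4096 * 4096
    lora = lora_trainable_params(4096, 4096, 8)
    print(lora, full // lora)
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Because only the small adapter factors are trained, they are also the only client-side parameters that would need to be aggregated in such a setup, which is what makes the combination attractive over bandwidth-limited links.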

Taxonomy

Topics: Privacy-Preserving Technologies in Data