SplitLLM: Hierarchical Split Learning for Large Language Model over Wireless Network

Songge Zhang, Guoliang Cheng, Zuguang Li, and Wen Wu

arXiv:2501.13318 · cs.DC · January 24, 2025

TL;DR

SplitLLM introduces a hierarchical split learning approach for fine-tuning large language models over wireless networks. It reduces peak memory usage by up to 74% and eases communication congestion while still enabling personalized services.

Contribution

The paper proposes a novel hierarchical split learning scheme that partitions LLM and LoRA adapters across cloud, edge, and user sides for efficient wireless network training.
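The three-way partition described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the authors' implementation: the `LoRALinear` class, the `partition` helper, the toy dimensions, and the cut points are all assumptions. In particular, placing the earliest layers on the user side and the last layers on the cloud is a common split learning convention assumed here, not something this summary states.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; stands in for pre-trained or freshly initialised weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

class LoRALinear:
    """Toy layer: frozen weight W plus a trainable low-rank update A @ B."""
    def __init__(self, dim, rank, rng):
        self.W = rand_matrix(dim, dim, rng)           # frozen pre-trained weight
        self.A = rand_matrix(dim, rank, rng)          # trainable adapter factor
        self.B = [[0.0] * dim for _ in range(rank)]   # trainable, zero-initialised

    def forward(self, x):
        base = matmul(x, self.W)
        update = matmul(matmul(x, self.A), self.B)
        return [[p + q for p, q in zip(rb, ru)] for rb, ru in zip(base, update)]

def partition(layers, user_count, edge_count):
    """Split an ordered layer list into user, edge, and cloud segments."""
    if user_count < 0 or edge_count < 0 or user_count + edge_count > len(layers):
        raise ValueError("cut points exceed the number of layers")
    cut = user_count + edge_count
    return layers[:user_count], layers[user_count:cut], layers[cut:]

def run_segment(segment, activations):
    """Forward activations through the layers held by one side."""
    for layer in segment:
        activations = layer.forward(activations)
    return activations

def trainable_params(segment):
    """Count adapter parameters only; the frozen W is excluded."""
    return sum(len(l.A) * len(l.A[0]) + len(l.B) * len(l.B[0]) for l in segment)

rng = random.Random(0)
layers = [LoRALinear(dim=8, rank=2, rng=rng) for _ in range(6)]
user_part, edge_part, cloud_part = partition(layers, user_count=1, edge_count=2)

x = rand_matrix(4, 8, rng)               # batch of 4 hypothetical input vectors
h_user = run_segment(user_part, x)       # activations that would leave the user device
h_edge = run_segment(edge_part, h_user)  # activations that would leave the edge server
out = run_segment(cloud_part, h_edge)    # final output computed on the cloud
```

In this sketch each side stores only its own slice of the frozen weights plus the matching adapters, and only the intermediate activations (`h_user`, `h_edge`) would cross the network.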

Findings

01. Reduces peak memory usage by up to 74%
02. Enables parallel training of multiple users and edge servers
03. Improves communication efficiency in wireless LLM fine-tuning

Abstract

Fine-tuning a large language model (LLM) using the local data of edge users can enable personalized services and applications. For privacy protection, the prevalent solution adopts distributed learning for fine-tuning and integrates low-rank adaptation (LoRA) to reduce users' computational load. However, as the number of users increases, numerous users simultaneously communicate with the server, and multiple server-side models concurrently execute on the server, leading to significant communication congestion and memory pressure. In this paper, we propose a split learning (SL) scheme for fine-tuning LLMs in wireless networks, which involves one cloud server, a small number of edge servers, and multiple users. Specifically, the pre-trained model and LoRA adapters are divided into three parts and deployed across the cloud, edge, and user sides. The training process follows the sequence of…
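The abstract's remark that LoRA reduces users' computational load can be made concrete with the standard LoRA parameterization. The dimensions used in the worked numbers ($d = k = 4096$, $r = 8$) are illustrative assumptions, not values taken from the paper.

```latex
% Standard LoRA update: the pre-trained weight W_0 stays frozen,
% and only the low-rank factors B and A are trained.
\[
  W' = W_0 + BA, \qquad
  W_0 \in \mathbb{R}^{d \times k},\quad
  B \in \mathbb{R}^{d \times r},\quad
  A \in \mathbb{R}^{r \times k},\quad
  r \ll \min(d, k).
\]
% Trainable parameters per adapted matrix, versus full fine-tuning:
\[
  \underbrace{r\,(d + k)}_{\text{LoRA}}
  \quad\text{vs.}\quad
  \underbrace{d\,k}_{\text{full}},
  \qquad
  \text{e.g. } 8 \times (4096 + 4096) = 65{,}536
  \ \text{ vs. }\
  4096^2 = 16{,}777{,}216 .
\]
```

Under these assumed dimensions the adapter trains 1/256 (about 0.39%) of the parameters of the full matrix, which is why only the adapters need gradients and optimizer state.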

Taxonomy

Topics: Speech Recognition and Synthesis