Split Fine-Tuning for Large Language Models in Wireless Networks
Songge Zhang, Guoliang Cheng, Xinyu Huang, Zuguang Li, Wen Wu, Lingyang Song, and Xuemin Shen

TL;DR
This paper introduces Split Fine-Tuning (SFT), an efficient method for adapting large language models in wireless networks by splitting models, compressing data, and optimizing resource management to reduce delay and communication costs.
Contribution
The paper proposes a novel split fine-tuning scheme with joint optimization of compression and resource allocation for LLMs in wireless environments.
Findings
Reduces fine-tuning delay by up to 80.2%.
Decreases communication overhead by 93.6%.
Maintains model accuracy within constraints.
Abstract
Fine-tuning is the process of adapting pre-trained large language models (LLMs) to downstream tasks. Because of their large parameter counts, fine-tuning LLMs on mobile devices demands considerable memory and suffers from high communication overhead and long fine-tuning delay. In this paper, we propose an efficient LLM fine-tuning scheme for wireless networks, named Split Fine-Tuning (SFT), which can accommodate LLM fine-tuning on mobile devices. Specifically, an LLM is split into a server-side part on the edge server and a device-side part on the mobile device to satisfy the device-side memory constraint. All devices share one server-side model and fine-tune in parallel to reduce fine-tuning delay. In addition, to reduce the significant communication overhead incurred by data exchange between devices and the edge server, we propose a data compression scheme by jointly leveraging…
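To make the splitting idea in the abstract concrete, here is a minimal sketch of how a model could be cut into a device-side and a server-side part under a device memory budget. It is an illustration only, not the paper's method: the greedy rule in `choose_cut`, the toy layers, the per-layer memory figures, and the 250 MB budget are all assumptions made for this example. The paper optimizes the split jointly with compression and resource allocation, which this sketch does not attempt.

```python
from typing import Callable, List, Sequence, Tuple

Layer = Callable[[List[float]], List[float]]


def choose_cut(layer_mem_mb: Sequence[float], device_budget_mb: float) -> int:
    """Hypothetical greedy rule: count the leading layers that fit on the device."""
    used = 0.0
    cut = 0
    for mem in layer_mem_mb:
        if used + mem > device_budget_mb:
            break
        used += mem
        cut += 1
    return cut


def split_model(layers: Sequence[Layer], cut: int) -> Tuple[List[Layer], List[Layer]]:
    """Split a layer list into (device-side part, server-side part)."""
    return list(layers[:cut]), list(layers[cut:])


def run(layers: Sequence[Layer], x: List[float]) -> List[float]:
    """Apply the layers in order to the input vector."""
    for layer in layers:
        x = layer(x)
    return x


# Toy stand-ins for model layers (assumed for illustration only).
def scale(k: float) -> Layer:
    return lambda x: [k * v for v in x]


def shift(b: float) -> Layer:
    return lambda x: [v + b for v in x]


layers = [scale(2.0), shift(1.0), scale(0.5), shift(-3.0)]
layer_mem_mb = [100.0, 100.0, 100.0, 100.0]  # assumed per-layer footprint

cut = choose_cut(layer_mem_mb, device_budget_mb=250.0)
device_part, server_part = split_model(layers, cut)

x = [1.0, 2.0]
activations = run(device_part, x)        # computed on the mobile device
output = run(server_part, activations)   # completed on the edge server
```

Only the forward pass is shown. In split training generally, the server would also send gradients with respect to `activations` back to the device, and that exchange in both directions is the traffic the abstract's compression scheme targets.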
Taxonomy
Topics: Topic Modeling · Recommender Systems and Techniques · Context-Aware Activity Recognition Systems
