MobiLLM: Enabling LLM Fine-Tuning on the Mobile Device via Server Assisted Side Tuning

Liang Li; Xingke Yang; Wen Wu; Hao Wang; Tomoaki Ohtsuki; Xin Fu; Miao Pan; Xuemin Shen

arXiv:2502.20421·cs.LG·March 3, 2025

TL;DR

MobiLLM enables mobile devices to fine-tune large language models efficiently by offloading training computations to a server, preserving privacy and reducing resource requirements.

Contribution

MobiLLM introduces a server-assisted side-tuning approach that separates adapters from the backbone, enabling memory-efficient on-device LLM fine-tuning without exposing data.

Findings

01 Enables CPU-only mobile devices to fine-tune LLMs.

02 Reduces memory usage and convergence time significantly.

03 Maintains data privacy during fine-tuning.

Abstract

Large language models (LLMs) on mobile devices and their potential applications never fail to fascinate. However, on-device LLM fine-tuning poses great challenges due to extremely high memory requirements and slow training speeds. Even parameter-efficient fine-tuning (PEFT) methods, which update only a small subset of parameters, are beyond what resource-constrained mobile devices can afford. In this paper, we propose MobiLLM to enable memory-efficient transformer LLM fine-tuning on a mobile device via server-assisted side-tuning. In particular, MobiLLM allows the resource-constrained mobile device to retain merely a frozen backbone model, while offloading the memory- and computation-intensive backpropagation of a trainable side-network to a high-performance server. Unlike existing fine-tuning methods that keep trainable parameters inside the frozen backbone, MobiLLM separates a set of parallel…
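The division of labor described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `FrozenBackbone`, `SideNetwork`, and `mean_loss` are hypothetical names, the backbone is a pair of tiny fixed layers rather than a transformer, and the side-network is a plain linear model. The sketch also assumes that the device ships intermediate activations (and, for simplicity, regression targets) to the server; the excerpt above is truncated before it specifies what is transferred. The point it shows is only that training updates touch the server-side parameters while the device-side backbone stays frozen and runs forward passes only.

```python
import math
import random


class FrozenBackbone:
    """Toy stand-in for the frozen backbone kept on the device (forward only)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        self.w2 = [rng.uniform(-1.0, 1.0) for _ in range(3)]

    def forward(self, x):
        # No gradients are tracked: the device only produces activations.
        h1 = [math.tanh(w * x) for w in self.w1]
        h2 = [math.tanh(w * h) for w, h in zip(self.w2, h1)]
        return h1 + h2  # activations from both layers, concatenated


class SideNetwork:
    """Toy trainable side-network held on the server (a linear model here)."""

    def __init__(self, n_inputs):
        self.w = [0.0] * n_inputs
        self.b = 0.0

    def predict(self, acts):
        return sum(w * a for w, a in zip(self.w, acts)) + self.b

    def train_step(self, acts, target, lr=0.05):
        # The gradient step updates side-network parameters only.
        err = self.predict(acts) - target
        self.w = [w - lr * err * a for w, a in zip(self.w, acts)]
        self.b -= lr * err


def mean_loss(side, batch):
    """Mean squared-error loss (with the usual 1/2 factor) over a batch."""
    return sum(0.5 * (side.predict(a) - y) ** 2 for a, y in batch) / len(batch)


# Device side: forward passes through the frozen backbone.
backbone = FrozenBackbone()
frozen_snapshot = (list(backbone.w1), list(backbone.w2))
rng = random.Random(1)
inputs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
# Assumption of this sketch: activations plus targets are what gets uploaded.
uploaded = [(backbone.forward(x), 2.0 * x) for x in inputs]

# Server side: all training happens here, on the side-network alone.
side = SideNetwork(n_inputs=6)
initial_loss = mean_loss(side, uploaded)
for _ in range(200):
    for acts, y in uploaded:
        side.train_step(acts, y)
final_loss = mean_loss(side, uploaded)

print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
print("backbone unchanged:", frozen_snapshot == (backbone.w1, backbone.w2))
```

In this toy, the device never stores gradients or optimizer state, which is the memory saving the abstract points to; the raw inputs `x` also never appear on the server side of the script, only the activations derived from them.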
