PocketLLM: Enabling On-Device Fine-Tuning for Personalized LLMs
Dan Peng, Zhihui Fu, Jun Wang

TL;DR
This paper introduces PocketLLM, a method that uses derivative-free optimization to fine-tune large language models directly on mobile devices, keeping private data on the device and working within tight memory limits.
Contribution
It presents a novel approach employing derivative-free optimization for on-device LLM fine-tuning, demonstrating feasibility on mobile hardware.
Findings
Fine-tuned RoBERTa-large locally on an OPPO Reno 6 smartphone using around 4GB of memory
Fine-tuned OPT-1.3B locally on the same smartphone using around 6.5GB of memory
Shows potential for personalized LLMs on resource-limited devices
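To make the memory argument concrete, the following is a rough back-of-the-envelope sketch, not a measurement from the paper. It assumes 32-bit weights, approximate parameter counts (about 355M for RoBERTa-large and 1.3B for OPT-1.3B), and an Adam-style optimizer that keeps a gradient plus two moment buffers per parameter. The function name and its arguments are hypothetical.

```python
def param_memory_gb(n_params, bytes_per_value=4, extra_copies=0):
    """Memory for the weights plus `extra_copies` same-sized buffers, in GB.

    Assumption: every buffer stores one value per parameter.
    Activations and runtime overhead are ignored.
    """
    return n_params * bytes_per_value * (1 + extra_copies) / 1e9


# Approximate parameter counts (assumed, not taken from the text above).
MODELS = {"RoBERTa-large": 355e6, "OPT-1.3B": 1.3e9}


def summarize():
    """Return {model: (weights_only_gb, adam_style_gb)}."""
    return {
        name: (
            param_memory_gb(n),                  # derivative-free: weights only
            param_memory_gb(n, extra_copies=3),  # gradient + two Adam moments
        )
        for name, n in MODELS.items()
    }


if __name__ == "__main__":
    for name, (weights_only, adam_style) in summarize().items():
        print(f"{name}: {weights_only:.2f} GB weights, {adam_style:.2f} GB with Adam-style state")
```

Under these assumptions, gradients and optimizer state multiply the parameter memory by four, which is the overhead a derivative-free method avoids. Figures measured on a device would also include activations and runtime overhead, which this estimate leaves out.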
Abstract
Recent advancements in large language models (LLMs) have showcased their impressive capabilities. On mobile devices, the wealth of valuable, non-public data generated daily holds great promise for locally fine-tuning personalized LLMs, while maintaining privacy through on-device processing. However, the constraints of mobile device resources pose challenges to direct on-device LLM fine-tuning, mainly due to the memory-intensive nature of derivative-based optimization, which requires saving gradients and optimizer states. To tackle this, we propose employing derivative-free optimization techniques to enable on-device fine-tuning of LLMs, even on memory-limited mobile devices. Empirical results demonstrate that the RoBERTa-large model and OPT-1.3B can be fine-tuned locally on the OPPO Reno 6 smartphone using around 4GB and 6.5GB of memory respectively, using derivative-free…
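The abstract excerpt does not say which derivative-free method is used, so the following is only an illustrative sketch of one common family: a two-point zeroth-order (SPSA-style) update. It estimates a directional slope from two loss evaluations and never stores a gradient or optimizer state. The random direction is regenerated from a seed rather than kept in memory. All names (`zo_step`, `quadratic_loss`) and hyperparameters are hypothetical, and a toy quadratic stands in for a model's loss.

```python
import random


def zo_step(params, loss_fn, lr=0.01, eps=1e-3, seed=0):
    """One two-point zeroth-order step; updates `params` (a list) in place.

    Only forward evaluations of `loss_fn` are used. The direction z is
    re-sampled from `seed` each time it is needed, so it is never stored.
    """
    def shift(scale):
        rng = random.Random(seed)  # same seed -> same direction z
        for i in range(len(params)):
            params[i] += scale * rng.gauss(0.0, 1.0)

    shift(+eps)                    # params + eps * z
    loss_plus = loss_fn(params)
    shift(-2 * eps)                # params - eps * z
    loss_minus = loss_fn(params)
    shift(+eps)                    # back to the original params

    slope = (loss_plus - loss_minus) / (2 * eps)  # directional derivative estimate
    shift(-lr * slope)             # params -= lr * slope * z
    return slope


def quadratic_loss(params, target=(1.0, -2.0, 0.5)):
    """Toy stand-in for a model loss: squared distance to `target`."""
    return sum((p - t) ** 2 for p, t in zip(params, target))


def run(steps=2000):
    """Run `steps` updates from zero and return (initial_loss, final_loss)."""
    params = [0.0, 0.0, 0.0]
    initial = quadratic_loss(params)
    for step in range(steps):
        zo_step(params, quadratic_loss, seed=step)  # fresh direction per step
    return initial, quadratic_loss(params)
```

Calling `run()` drives the toy loss toward zero using forward evaluations alone. The trade-off in this kind of method is that each step needs two loss evaluations and the slope estimate is noisy, so more steps are typically needed than with backpropagation.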
Taxonomy
Topics: Advanced Data Storage Technologies
