Resource-Efficient Personal Large Language Models Fine-Tuning with Collaborative Edge Computing
Shengyuan Ye, Bei Ouyang, Tianyi Qian, Liekang Zeng, Jingyi Li, Jiangsu Du, Xiaowen Chu, Guoliang Xing, Xu Chen

TL;DR
This paper introduces PAC, a resource-efficient collaborative edge computing framework for personal LLM fine-tuning that pools nearby edge devices to significantly reduce fine-tuning time and memory requirements.
Contribution
The paper presents PAC, a novel algorithm-system co-designed framework that enables efficient, distributed fine-tuning of personal LLMs on edge devices, overcoming resource constraints.
Findings
Achieves up to 8.64x speedup in fine-tuning
Reduces memory footprint by up to 88.16%
Outperforms state-of-the-art methods in efficiency
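To make the two headline figures easier to compare, the short sketch below converts them into a common form. It is plain arithmetic on the numbers listed above, not code from the paper; the helper names are hypothetical, and both figures are "up to" values that may have been measured against different baselines.

```python
# Illustrative arithmetic only; function names are hypothetical helpers,
# not part of PAC. Inputs are the "up to" figures from the Findings list.

def remaining_fraction_from_speedup(speedup: float) -> float:
    """Fraction of the baseline time that remains after an N-x speedup."""
    return 1.0 / speedup


def shrink_factor_from_reduction(reduction_pct: float) -> float:
    """How many times smaller a quantity is after an R% reduction."""
    return 1.0 / (1.0 - reduction_pct / 100.0)


time_fraction = remaining_fraction_from_speedup(8.64)
memory_factor = shrink_factor_from_reduction(88.16)

print(f"Time remaining vs. baseline: {time_fraction:.1%}")   # 11.6%
print(f"Memory footprint shrinks by: {memory_factor:.2f}x")  # 8.45x
```

Read this way, the best-case results are of similar magnitude: fine-tuning time falls to roughly 11.6% of the baseline, and the memory footprint falls to roughly 11.8% of the baseline.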
Abstract
Large language models (LLMs) have unlocked a plethora of powerful applications at the network edge, such as intelligent personal assistants. Data privacy and security concerns have prompted a shift towards edge-based fine-tuning of personal LLMs, away from cloud reliance. However, this raises issues of computational intensity and resource scarcity, hindering training efficiency and feasibility. While current studies investigate parameter-efficient fine-tuning (PEFT) techniques to mitigate resource constraints, our analysis indicates that these techniques are not sufficiently resource-efficient for edge devices. To tackle these challenges, we propose Pluto and Charon (PAC), a time and memory efficient collaborative edge AI framework for personal LLMs fine-tuning. PAC breaks the resource wall of personal LLMs fine-tuning with a sophisticated algorithm-system co-design. (1)…
Taxonomy
Topics: Scientific Computing and Data Management
