Confidant: Customizing Transformer-based LLMs via Collaborative Edge Training

Yuhao Chen; Yuxuan Yan; Qianqian Yang; Yuanchao Shu; Shibo He; Jiming Chen

arXiv:2311.13381 · cs.LG · November 23, 2023 · 2 cites

TL;DR

Confidant enables efficient customization of large language models on mobile devices by partitioning models, optimizing hardware utilization, and accelerating training and inference, making LLM deployment on edge devices feasible.

Contribution

This paper introduces Confidant, a novel framework for collaborative training of LLMs on mobile devices through model partitioning and hardware-aware scheduling.

Findings

01. Achieves up to 45.3% memory reduction.

02. Provides 8.03x inference speedup.

03. Enables effective LLM customization on mobile devices.

Abstract

Transformer-based large language models (LLMs) have demonstrated impressive capabilities in a variety of natural language processing (NLP) tasks. Nonetheless, it is challenging to deploy and fine-tune LLMs on mobile edge devices with limited computing, memory, and energy budgets. In this paper, we propose Confidant, a multi-backend collaborative training framework for customizing state-of-the-art LLMs on commodity mobile devices like smartphones. Confidant partitions an LLM into several sub-models so that each fits into a mobile device's memory. A pipeline parallel training mechanism is further developed to ensure fast and efficient distributed training. In addition, we propose a novel backend scheduler to allocate different attention heads to heterogeneous compute hardware, including mobile CPU and GPUs, to maximize the compute resource utilization on each edge device. Our preliminary…
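
The two mechanisms the abstract names, splitting a model into sub-models that each fit one device's memory and spreading attention heads over heterogeneous backends, can be illustrated with a minimal sketch. This is not the paper's algorithm: the greedy contiguous-layer fit and the speed-proportional head split are stand-in heuristics, and every name here (`Device`, `partition_layers`, `allocate_heads`, the memory and speed figures) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    memory_mb: float  # hypothetical memory budget available for model weights


def partition_layers(layer_sizes_mb, devices):
    """Greedily assign contiguous layers to devices, in pipeline order.

    Returns a list of (device name, [layer indices]) stages.
    Raises ValueError if the layers do not fit in the combined budget.
    """
    stages = []
    layer = 0
    for dev in devices:
        used = 0.0
        assigned = []
        # Keep adding the next layer while it still fits on this device.
        while (layer < len(layer_sizes_mb)
               and used + layer_sizes_mb[layer] <= dev.memory_mb):
            used += layer_sizes_mb[layer]
            assigned.append(layer)
            layer += 1
        if assigned:
            stages.append((dev.name, assigned))
    if layer < len(layer_sizes_mb):
        raise ValueError("model does not fit in the combined memory budget")
    return stages


def allocate_heads(num_heads, backend_speeds):
    """Split attention heads across backends in proportion to relative speed.

    Uses largest-remainder rounding so the counts always sum to num_heads.
    """
    total = sum(backend_speeds.values())
    shares = {b: num_heads * s / total for b, s in backend_speeds.items()}
    counts = {b: int(share) for b, share in shares.items()}
    leftover = num_heads - sum(counts.values())
    # Hand the remaining heads to the backends with the largest fractions.
    by_remainder = sorted(shares, key=lambda b: shares[b] - counts[b],
                          reverse=True)
    for b in by_remainder[:leftover]:
        counts[b] += 1
    return counts


if __name__ == "__main__":
    phones = [Device("phone_a", 1000), Device("phone_b", 1000)]
    print(partition_layers([400, 400, 400, 400], phones))
    print(allocate_heads(16, {"cpu": 1.0, "gpu": 2.0}))
```

Each stage returned by `partition_layers` would correspond to one pipeline stage on one device; a real scheduler would also have to account for activations, optimizer state, and measured per-backend latency rather than a fixed speed ratio.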

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Natural Language Processing Techniques