NVCiM-PT: An NVCiM-assisted Prompt Tuning Framework for Edge LLMs

Ruiyang Qin; Pengyu Ren; Zheyu Yan; Liu Liu; Dancheng Liu; Amir Nassereldine; Jinjun Xiong; Kai Ni; Sharon Hu; Yiyu Shi

arXiv:2411.08244·cs.LG·November 14, 2024

TL;DR

This paper introduces NVCiM-PT, a novel prompt tuning framework for edge LLMs that leverages non-volatile computing-in-memory architectures to enhance resource efficiency and address domain shift issues.

Contribution

The paper proposes the first NVCiM-assisted prompt tuning framework designed specifically for resource-constrained edge LLMs, built around in-situ matrix operations.

Findings

01. NVCiM-PT improves prompt tuning efficiency on edge devices.

02. The framework effectively addresses domain shift in edge LLMs.

03. In-situ matrix multiplication accelerates prompt tuning processes.

Abstract

Large Language Models (LLMs) deployed on edge devices, known as edge LLMs, need to continuously fine-tune their model parameters on user-generated data under tight resource constraints. However, most existing learning methods are unsuitable for edge LLMs because they demand substantial resources and offer limited learning capacity. Prompt tuning (PT) has recently emerged as an effective fine-tuning method for edge LLMs because it modifies only a small portion of the LLM parameters, but it suffers from user domain shifts, which force repeated retraining and erode its resource efficiency. Conventional techniques for handling domain shift often involve complex neural networks and sophisticated training, which are incompatible with PT on edge LLMs. Therefore, an open research question is how to address domain shift for edge LLMs with limited resources. In this paper, we propose a prompt…
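To make concrete what "only modifying a small portion of LLM parameters" means in prompt tuning, here is a minimal toy sketch. It is not the paper's method and does not model NVCiM hardware. The frozen "model" is a hypothetical linear scorer over mean-pooled token embeddings, and all names (`FROZEN_W`, `forward`, `tune_prompt`) and numbers are illustrative assumptions. Only the prepended soft-prompt vectors receive gradient updates.

```python
# Toy prompt tuning: the model weights stay frozen, only the soft prompt is trained.
# Assumption: the "LLM" is a linear scorer over mean-pooled embeddings (illustrative only).

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def forward(w, prompt, tokens):
    """Prepend the soft prompt to the token embeddings, mean-pool, then score."""
    seq = prompt + tokens
    n = len(seq)
    pooled = [sum(v[j] for v in seq) / n for j in range(len(w))]
    return dot(w, pooled)

def tune_prompt(w, prompt, tokens, target, lr=0.5, steps=50):
    """Gradient descent on squared error, updating the prompt vectors only."""
    prompt = [list(p) for p in prompt]  # copy so the caller's prompt is untouched
    n = len(prompt) + len(tokens)
    for _ in range(steps):
        err = forward(w, prompt, tokens) - target
        for p in prompt:
            for j in range(len(w)):
                # d(loss)/d(p[j]) = 2 * err * w[j] / n for mean pooling
                p[j] -= lr * 2.0 * err * w[j] / n
    return prompt

FROZEN_W = (0.5, -1.0, 2.0)                     # frozen weights (a tuple, never updated)
tokens = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # fixed input embeddings
init_prompt = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # two trainable "virtual tokens"
target = 1.0

loss_before = (forward(FROZEN_W, init_prompt, tokens) - target) ** 2
tuned = tune_prompt(FROZEN_W, init_prompt, tokens, target)
loss_after = (forward(FROZEN_W, tuned, tokens) - target) ** 2
```

In this sketch the trainable state is just the two prompt vectors (six numbers), while `FROZEN_W` is never written to. A shift in the user's data distribution would correspond to a new `tokens`/`target` pair for which the tuned prompt no longer fits, which is the retraining cost the abstract refers to.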

Taxonomy

Topics: VLSI and Analog Circuit Testing