PrivateLoRA For Efficient Privacy Preserving LLM

Yiming Wang; Yu Lin; Xiaodong Zeng; Guannan Zhang

arXiv:2311.14030 · cs.AI · November 27, 2023 · 5 cites

TL;DR

PrivateLoRA introduces a privacy-preserving, communication-efficient framework for distributed LLM inference that maintains data locality and achieves high throughput on edge devices, enabling democratized access to advanced AI models.

Contribution

The paper proposes PrivateLoRA, a technique that exploits the low rank of residual activations to cut the communication overhead of distributed LLM inference by over 95% while keeping privacy-sensitive computation on the edge device.

Findings

01. Reduces communication overhead by over 95%.

02. Achieves 300% of the throughput of device-only solutions for 7B models.

03. Provides tuning performance comparable to LoRA for personalization.

Abstract

End users face a choice between privacy and efficiency in current Large Language Model (LLM) service paradigms. In cloud-based paradigms, users are forced to compromise data locality for generation quality and processing speed. Conversely, edge device paradigms maintain data locality but fail to deliver satisfactory performance. In this work, we propose a novel LLM service paradigm that distributes privacy-sensitive computation on edge devices and shared computation in the cloud. Only activations are transmitted between the central cloud and edge devices to ensure data locality. Our core innovation, PrivateLoRA, addresses the challenging communication overhead by exploiting the low rank of residual activations, achieving over 95% communication reduction. Consequently, PrivateLoRA effectively maintains data locality and is extremely resource efficient. Under standard 5G networks,…
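The link between low-rank activations and the reported communication saving can be illustrated with simple arithmetic. The sketch below is not the authors' implementation: it assumes a hypothetical hidden width of 4096 and a hypothetical rank of 128, neither of which is stated in the abstract, and it counts only the values sent per token in a single transfer.

```python
def values_per_transfer(width: int) -> int:
    """Number of scalar values sent for one token in one transfer."""
    if width <= 0:
        raise ValueError("width must be positive")
    return width


def communication_reduction(hidden_dim: int, rank: int) -> float:
    """Fraction of transmitted values saved when a rank-sized activation
    is sent instead of a full hidden_dim-sized activation."""
    if not 0 < rank <= hidden_dim:
        raise ValueError("rank must satisfy 0 < rank <= hidden_dim")
    full = values_per_transfer(hidden_dim)
    low_rank = values_per_transfer(rank)
    return 1.0 - low_rank / full


# Hypothetical sizes chosen for illustration only (assumptions, not from the paper).
HIDDEN_DIM = 4096
RANK = 128

reduction = communication_reduction(HIDDEN_DIM, RANK)
print(f"Reduction: {reduction:.2%}")
```

Under these assumed sizes the saving is 1 - 128/4096, which is above 95%. The actual figure in the paper depends on its own choice of rank, model width, and how many transfers occur per token.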

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Topic Modeling