Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning

Xiao-Yang Liu; Rongyi Zhu; Daochen Zha; Jiechao Gao; Shan Zhong; Matt White; Meikang Qiu

arXiv:2312.17493·cs.LG·June 4, 2024·1 cites

TL;DR

This paper presents DP-LoRA, a federated learning method for large language models that ensures data privacy through noise addition and reduces communication costs with low-rank adaptation, validated on diverse datasets.

Contribution

Introduction of DP-LoRA, a novel federated learning algorithm combining differential privacy and low-rank adaptation for efficient, privacy-preserving LLM fine-tuning.

Findings

01 DP-LoRA maintains strict privacy with Gaussian noise addition.

02 It significantly reduces communication overhead during training.

03 Experimental results show effective privacy preservation across datasets.
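
The two mechanisms named in the findings, Gaussian noise addition and low-rank adaptation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the clipping step, the convention that the noise standard deviation equals `noise_multiplier * clip_norm`, and the 4096 x 4096 layer size are all assumptions chosen for illustration, following the generic Gaussian mechanism and the standard LoRA factorization of a weight update into two low-rank matrices.

```python
import math
import random


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a rank-r LoRA pair: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_out + d_in)


def clip_and_noise(update, clip_norm, noise_multiplier, rng):
    """Clip a flat update to L2 norm <= clip_norm, then add Gaussian noise.

    The noise std is noise_multiplier * clip_norm. This is a common
    convention for the Gaussian mechanism and an assumption here,
    not a detail taken from the paper.
    """
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [x * scale + rng.gauss(0.0, sigma) for x in update]


def federated_average(client_updates):
    """Element-wise mean of equally weighted client updates."""
    n = len(client_updates)
    return [sum(vals) / n for vals in zip(*client_updates)]


if __name__ == "__main__":
    # Hypothetical layer size and rank, for illustration only.
    d_out, d_in, rank = 4096, 4096, 8
    full = d_out * d_in
    low_rank = lora_param_count(d_out, d_in, rank)
    print(f"full update: {full} values, LoRA update: {low_rank} values")
    print(f"reduction factor: {full // low_rank}x")

    # Each client privatizes its (flattened) low-rank update before upload.
    rng = random.Random(0)
    clients = [[0.5, -1.2, 0.3], [0.4, -0.9, 0.1], [0.6, -1.0, 0.2]]
    noisy = [clip_and_noise(u, clip_norm=1.0, noise_multiplier=0.1, rng=rng)
             for u in clients]
    print("aggregated update:", federated_average(noisy))
```

In this sketch each client uploads only the low-rank factors, so the per-round payload for the hypothetical layer shrinks from 16,777,216 values to 65,536, and the server only ever sees clipped, noised updates. The actual privacy guarantee depends on how the noise scale is calibrated to a privacy budget, which this sketch does not do.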

Abstract

The surge in interest and application of large language models (LLMs) has sparked a drive to fine-tune these models to suit specific applications, such as finance and medical science. However, concerns regarding data privacy have emerged, especially when multiple stakeholders aim to collaboratively enhance LLMs using sensitive data. In this scenario, federated learning becomes a natural choice, allowing decentralized fine-tuning without exposing raw data to central servers. Motivated by this, we investigate how data privacy can be ensured in LLM fine-tuning through practical federated learning approaches, enabling secure contributions from multiple parties to enhance LLMs. Yet, challenges arise: 1) despite avoiding raw data exposure, there is a risk of inferring sensitive information from model outputs, and 2) federated learning for LLMs incurs notable communication overhead. To address…

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Artificial Intelligence in Healthcare and Education