LanFL: Differentially Private Federated Learning with Large Language Models using Synthetic Samples
Huiyu Wu, Diego Klabjan

TL;DR
LanFL introduces a privacy-preserving federated learning approach for large language models that uses synthetic samples and prompt optimization, enabling collaborative learning without sharing sensitive data or model weights.
Contribution
The paper presents LanFL, a novel prompt-based federated learning scheme for LLMs that treats the underlying models as black boxes and uses differentially private synthetic sample generation to share knowledge among participants, addressing both the computational cost and the privacy challenges of applying FL to LLMs.
Findings
LanFL effectively enables collaborative learning among participants.
The method preserves privacy of local datasets.
Experiments show successful learning across various tasks.
Abstract
Federated Learning (FL) is a collaborative, privacy-preserving machine learning framework that enables multiple participants to train a single global model. However, the recent advent of powerful Large Language Models (LLMs) with tens to hundreds of billions of parameters makes the naive application of traditional FL methods to LLMs impractical due to high computational and communication costs. Furthermore, end users of LLMs often lack access to full architectures and weights of the models, making it impossible for participants to fine-tune these models directly. This paper introduces a novel FL scheme for LLMs, named LanFL, which is purely prompt-based and treats the underlying LLMs as black boxes. We have developed a differentially private synthetic sample generation mechanism to facilitate knowledge sharing among participants, along with a prompt optimization scheme that enables…
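The data flow described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the prompt wording, and the `stub_llm` stand-in are all hypothetical, and the differential-privacy mechanism (not detailed in this excerpt) is left as a marked placeholder rather than implemented.

```python
from typing import Callable, Dict, List

# Assumption: the black-box LLM is any text-in/text-out callable (e.g. a hosted API).
LLM = Callable[[str], str]


def make_synthetic(llm: LLM, local_data: List[str], n: int) -> List[str]:
    """Ask the black-box LLM for n new samples resembling local_data.

    Placeholder: a real scheme would apply a differential-privacy
    mechanism at this step; this sketch does NOT provide any privacy guarantee.
    """
    prompt = "Write one new example similar to:\n" + "\n".join(local_data)
    return [llm(prompt) for _ in range(n)]


def federated_round(llm: LLM, clients: Dict[str, List[str]], n: int) -> Dict[str, str]:
    """One hypothetical round: share synthetic samples, then build local prompts."""
    # Each client publishes only synthetic samples, never raw data or model weights.
    shared = {name: make_synthetic(llm, data, n) for name, data in clients.items()}

    prompts = {}
    for name, data in clients.items():
        # Collect the synthetic samples published by every *other* client.
        received = [s for other, samples in shared.items()
                    if other != name for s in samples]
        # The local prompt combines the client's own data with received samples.
        prompts[name] = "Examples:\n" + "\n".join(data + received) + "\nTask input:"
    return prompts


def stub_llm(prompt: str) -> str:
    # Stand-in for a real LLM call so the sketch runs offline.
    return "SYN"


if __name__ == "__main__":
    demo = federated_round(stub_llm, {"a": ["a1"], "b": ["b1"]}, n=2)
    print(demo["a"])
```

The point of the sketch is the direction of information flow: raw local samples stay with their owner, and only generated samples cross client boundaries before being folded into each participant's prompt.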
Taxonomy
Topics: Privacy-Preserving Technologies in Data
