Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs
Thierry Bossy, Julien Vignoud, Tahseen Rabbani, Juan R. Troncoso Pastoriza, Martin Jaggi

TL;DR
This paper shows that LoRA fine-tuning in federated learning reduces memorization of training data in large language models by a factor of up to 10, across high-risk domains (medicine, law, and finance) and model sizes from 1B to 70B parameters, without significant performance cost.
Contribution
It demonstrates that LoRA, an existing and widely used fine-tuning strategy, mitigates memorization in federated learning, with substantial privacy improvements across multiple model sizes and high-risk domains.
Findings
LoRA reduces memorization by a factor of up to 10 in FL-trained models.
LoRA decreases memorization in models from 1B to 70B parameters.
Combining LoRA with other privacy techniques further enhances privacy.
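For readers unfamiliar with LoRA, the following is a minimal sketch of the standard low-rank update, in which the pretrained weight W stays frozen and only two small matrices A and B are trained, giving an effective weight W + (alpha / r) · B · A. The dimensions, rank, and alpha below are illustrative assumptions, not the paper's settings, and the helper names are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * B @ A without modifying the frozen W."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Parameter counts for full fine-tuning versus a rank-`rank` adapter."""
    return {"full": d_out * d_in, "lora": rank * (d_out + d_in)}


random.seed(0)
d_out, d_in, rank, alpha = 6, 4, 2, 4.0  # toy sizes, chosen only for illustration

# Frozen pretrained weight (d_out x d_in); never updated during fine-tuning.
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable adapter factors: A is small random (rank x d_in), B starts at zero (d_out x rank).
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha, rank)
counts = trainable_params(d_out, d_in, rank)
large_counts = trainable_params(4096, 4096, 8)
```

Because B is initialized to zero, the adapted model starts out identical to the pretrained one. For a hypothetical 4096 x 4096 layer with rank 8, the adapter has 65,536 trainable parameters against 16,777,216 for the full matrix.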
Abstract
Federated learning (FL) is a popular paradigm for collaborative training which avoids direct data exposure between clients. However, data privacy issues remain: FL-trained large language models are capable of memorizing and completing phrases and sentences contained in training data when given their prefixes. Thus, it is possible for adversarial and honest-but-curious clients to recover training data of other participants simply through targeted prompting. In this work, we demonstrate that a popular and simple fine-tuning strategy, low-rank adaptation (LoRA), reduces memorization during FL by a factor of up to 10 without significant performance cost. We study this effect by performing fine-tuning tasks in high-risk domains such as medicine, law, and finance. We observe a reduction in memorization for a wide variety of model families, from 1B to 70B parameters. We find that LoRA…
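The prefix-completion risk described in the abstract can be illustrated with a small sketch: prompt a model with the first tokens of a training sequence and check whether it reproduces the rest. This summary does not specify the paper's actual metric, so the exact-match criterion, the `generate` callable, the function names, and the made-up sample strings below are all assumptions for illustration only.

```python
from typing import Callable, List, Sequence


def memorization_rate(
    generate: Callable[[Sequence[str], int], Sequence[str]],
    samples: Sequence[Sequence[str]],
    prefix_len: int,
    suffix_len: int,
) -> float:
    """Fraction of samples whose suffix is reproduced exactly from its prefix.

    `generate(prefix, n)` is a hypothetical stand-in for a model call that
    returns n continuation tokens for the given prefix tokens.
    """
    if not samples:
        return 0.0
    hits = 0
    for tokens in samples:
        prefix = tokens[:prefix_len]
        target = list(tokens[prefix_len:prefix_len + suffix_len])
        completion = list(generate(prefix, len(target)))
        hits += completion == target  # exact match counts as memorized
    return hits / len(samples)


# Toy stand-in for a model that has memorized exactly one (made-up) record.
memorized: List[str] = "patient john doe was diagnosed with type two diabetes".split()
unseen: List[str] = "the court ruled that the contract was not enforceable".split()


def toy_generate(prefix: Sequence[str], n: int) -> List[str]:
    """Return the memorized continuation if the prefix matches, else filler."""
    if list(prefix) == memorized[:len(prefix)]:
        return memorized[len(prefix):len(prefix) + n]
    return ["<unk>"] * n


rate = memorization_rate(toy_generate, [memorized, unseen], prefix_len=4, suffix_len=5)
```

In this toy setup the stand-in model completes one of the two sequences, so `rate` is 0.5. With a real model, `generate` would wrap the model's decoding call, and a lower rate after fine-tuning would indicate less verbatim memorization under this particular criterion.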
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Advanced Data Storage Technologies
Methods: Gradient Clipping · LLaMA
