Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs

Thierry Bossy; Julien Vignoud; Tahseen Rabbani; Juan R. Troncoso Pastoriza; Martin Jaggi

arXiv:2502.05087·cs.LG·March 10, 2026

TL;DR

This paper shows that fine-tuning with LoRA in federated learning reduces memorization of sensitive training data in large language models by a factor of up to 10, across several high-risk domains and model sizes, without significant performance cost.

Contribution

The paper demonstrates that LoRA, an existing and widely used fine-tuning strategy, mitigates memorization when applied in federated learning, with the effect observed across model sizes from 1B to 70B parameters and in high-risk domains such as medicine, law, and finance.

Findings

01  LoRA reduces memorization by up to 10 times in FL models.

02  LoRA decreases memorization in models from 1B to 70B parameters.

03  Combining LoRA with other privacy techniques further enhances privacy.
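
For readers unfamiliar with LoRA, the following is a minimal sketch of the standard low-rank update, in which a frozen weight matrix W is augmented by a trainable product B·A scaled by alpha / rank. It is not taken from the paper or its repository; the function names, the pure-Python matrix representation, and the example shapes are illustrative assumptions.

```python
# Minimal sketch of a LoRA-style layer (assumed standard formulation,
# not the paper's implementation). Matrices are plain lists of rows.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha):
    """Compute y = W x + (alpha / rank) * B (A x).

    W: frozen base weights, shape d_out x d_in
    A: trainable down-projection, shape rank x d_in
    B: trainable up-projection, shape d_out x rank
    """
    rank = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_out, d_in, rank):
    """Return (full fine-tuning params, LoRA trainable params) for one matrix."""
    return d_out * d_in, rank * (d_in + d_out)


# Hypothetical example: a 4096 x 4096 projection with rank 8.
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora)
```

With B initialized to zeros, as is conventional for LoRA, the layer initially reproduces the base model's output, and only the much smaller A and B matrices are trained.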

Abstract

Federated learning (FL) is a popular paradigm for collaborative training which avoids direct data exposure between clients. However, data privacy issues still remain: FL-trained large language models are capable of memorizing and completing phrases and sentences contained in training data when given their prefixes. Thus, it is possible for adversarial and honest-but-curious clients to recover training data of other participants simply through targeted prompting. In this work, we demonstrate that a popular and simple fine-tuning strategy, low-rank adaptation (LoRA), reduces memorization during FL by a factor of up to 10 without significant performance cost. We study this effect by performing fine-tuning tasks in high-risk domains such as medicine, law, and finance. We observe a reduction in memorization for a wide variety of model families, from 1B to 70B parameters. We find that LoRA…
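
The prefix-completion risk described in the abstract can be illustrated with a small exact-match check: prompt a model with the prefix of a training sequence and test whether its output reproduces the true continuation. This is only a sketch under stated assumptions. The `generate` callable, the `toy_generate` stand-in model, and the exact-match criterion are hypothetical, and the paper's actual memorization metric is not given on this page.

```python
# Sketch of an exact-match memorization check (assumed metric, not the paper's).
from typing import Callable, Iterable, Tuple

Generate = Callable[[str, int], str]  # (prefix, max_chars) -> completion


def is_memorized(generate: Generate, prefix: str, suffix: str) -> bool:
    """True if the completion of `prefix` begins with the true `suffix`."""
    completion = generate(prefix, len(suffix))
    return completion.startswith(suffix)


def memorization_rate(generate: Generate,
                      samples: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (prefix, suffix) pairs the model completes verbatim."""
    samples = list(samples)
    if not samples:
        return 0.0
    hits = sum(is_memorized(generate, p, s) for p, s in samples)
    return hits / len(samples)


# Hypothetical stand-in for a fine-tuned model: it has "memorized" one record.
_MEMORY = {"Patient ID 1234 was diagnosed with ": "condition A"}


def toy_generate(prefix: str, max_chars: int) -> str:
    return _MEMORY.get(prefix, "an unrelated continuation")[:max_chars]


samples = [
    ("Patient ID 1234 was diagnosed with ", "condition A"),
    ("Account 5678 holds a balance of ", "42 units"),
]
print(memorization_rate(toy_generate, samples))
```

In practice `generate` would wrap a real model's decoding call, and a softer criterion (for example token-level overlap) might be preferred over exact matching.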

Code & Models

Repositories

tuneinsight/federated-llms
PyTorch · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Advanced Data Storage Technologies

Methods: Gradient Clipping · LLaMA