Privacy in Fine-tuning Large Language Models: Attacks, Defenses, and Future Directions

Hao Du; Shang Liu; Lele Zheng; Yang Cao; Atsuyoshi Nakamura; Lei Chen

arXiv:2412.16504 · cs.AI · April 8, 2025 · 2 cites

TL;DR

This paper surveys privacy risks in fine-tuning large language models, discusses attack methods and defenses, and suggests future research directions to enhance privacy preservation without compromising utility.

Contribution

It provides a comprehensive overview of privacy challenges, evaluates existing defense mechanisms, and identifies research gaps for developing better privacy-preserving fine-tuning techniques.

Findings

01. Membership inference and data extraction attacks pose significant privacy risks.

02. Current defenses like differential privacy and federated learning have limitations.

03. Future directions include improving defense effectiveness and balancing privacy with utility.

Abstract

Fine-tuning has emerged as a critical process in leveraging Large Language Models (LLMs) for specific downstream tasks, enabling these models to achieve state-of-the-art performance across various domains. However, the fine-tuning process often involves sensitive datasets, introducing privacy risks that exploit the unique characteristics of this stage. In this paper, we provide a comprehensive survey of privacy challenges associated with fine-tuning LLMs, highlighting vulnerabilities to various privacy attacks, including membership inference, data extraction, and backdoor attacks. We further review defense mechanisms designed to mitigate privacy risks in the fine-tuning phase, such as differential privacy, federated learning, and knowledge unlearning, discussing their effectiveness and limitations in addressing privacy risks and maintaining model utility. By identifying key gaps in…
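
To make the membership inference threat named in the abstract concrete, here is a minimal sketch of a generic loss-threshold attack. It is not taken from the paper: the function names, the toy token probabilities, and the threshold value are all illustrative assumptions. The idea is that a fine-tuned model tends to assign lower loss to examples it was trained on, so an attacker who can observe per-example loss may guess membership by comparing it with a calibrated threshold.

```python
import math
from typing import Sequence


def per_example_loss(token_probs: Sequence[float]) -> float:
    """Average negative log-likelihood of one example.

    `token_probs` is a hypothetical input format: the probability the
    model assigned to each true token of the example.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def infer_membership(loss: float, threshold: float) -> bool:
    """Guess 'member of the fine-tuning set' when the loss is below the threshold."""
    return loss < threshold


# Toy, made-up probabilities (assumption): the model is confident on the
# first example and much less confident on the second.
member_example = [0.9, 0.8, 0.95]
non_member_example = [0.3, 0.2, 0.4]

# Assumption: in practice the attacker would calibrate this value on
# reference data; 0.5 is chosen here only for illustration.
threshold = 0.5

member_loss = per_example_loss(member_example)
non_member_loss = per_example_loss(non_member_example)

print("low-loss example flagged as member:", infer_membership(member_loss, threshold))
print("high-loss example flagged as member:", infer_membership(non_member_loss, threshold))
```

Attacks of this kind generally need more careful calibration than a single fixed threshold, and defenses such as differential privacy aim to shrink the loss gap that the sketch relies on.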

Taxonomy

Topics: Privacy-Preserving Technologies in Data