Navigating the Designs of Privacy-Preserving Fine-tuning for Large Language Models

Haonan Shi; Tu Ouyang; An Wang

arXiv:2501.04323·cs.LG·January 22, 2025

Open Access

TL;DR

This paper introduces GuardedTuning, a framework of privacy-preserving fine-tuning designs for large language models that balance utility, privacy, and cost, supported by evaluation metrics and experimental validation.

Contribution

The paper systematically explores and evaluates several novel designs that combine system architectures with adapted privacy-enhancement methods and emerging computation techniques for privacy-preserving LLM fine-tuning.

Findings

01. GuardedTuning designs protect against data reconstruction attacks.

02. The designs maintain competitive fine-tuning performance.

03. The designs offer diverse trade-offs among privacy, utility, and cost.

Abstract

Instruction tuning has proven effective in enhancing the performance of Large Language Models (LLMs) on downstream tasks. However, real-world fine-tuning faces inherent conflicts between model providers' intellectual property protection, clients' data privacy requirements, and tuning costs. While recent approaches like split learning and offsite tuning demonstrate promising architectures for privacy-preserving fine-tuning, there is a gap in systematically addressing the multidimensional trade-offs required for diverse real-world deployments. We propose several indicative evaluation metrics to guide design trade-offs for privacy-preserving fine-tuning and a series of example designs, collectively named GuardedTuning; they result from novel combinations of system architectures with adapted privacy-enhancement methods and emerging computation techniques. Each design represents distinct…

Taxonomy

Topics: Privacy-Preserving Technologies in Data