On Active Privacy Auditing in Supervised Fine-tuning for White-Box Language Models

Qian Sun; Hanpeng Wu; Xi Sheryl Zhang

arXiv:2411.07070·cs.CL·November 13, 2024

Open Access

TL;DR

This paper introduces Parsing, an active privacy auditing framework that uses improved white-box membership inference attacks (MIAs) to identify and quantify privacy leakage risks during supervised fine-tuning (SFT) of language models, with the aim of supporting more privacy-resilient fine-tuned models.

Contribution

The paper presents a novel privacy auditing framework, Parsing, with improved attack techniques and a two-stage pipeline to monitor privacy risks in language model fine-tuning.

Findings

01. Effective privacy risk detection across various models
02. Significant privacy concerns identified during fine-tuning
03. Framework demonstrates high efficiency and reliability

Abstract

The pretraining and fine-tuning approach has become the leading technique for various NLP applications. However, recent studies reveal that fine-tuning data, due to their sensitive nature, domain-specific characteristics, and identifiability, pose significant privacy concerns. To help develop more privacy-resilient fine-tuning models, we introduce a novel active privacy auditing framework, dubbed Parsing, designed to identify and quantify privacy leakage risks during the supervised fine-tuning (SFT) of language models (LMs). The framework leverages improved white-box membership inference attacks (MIAs) as the core technology, utilizing novel learning objectives and a two-stage pipeline to monitor the privacy of the LMs' fine-tuning process, maximizing the exposure of privacy risks. Additionally, we have improved the effectiveness of MIAs on large LMs including GPT-2, Llama2, and certain…

Taxonomy

Topics: Privacy-Preserving Technologies in Data

Methods: Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Linear Layer · Dense Connections · Layer Normalization · Adam · Attention Dropout · Multi-Head Attention