On Active Privacy Auditing in Supervised Fine-tuning for White-Box Language Models
Qian Sun, Hanpeng Wu, Xi Sheryl Zhang

TL;DR
This paper introduces Parsing, an active privacy auditing framework that uses advanced white-box membership inference attacks to identify and quantify privacy leakage risks during supervised fine-tuning of language models, enhancing privacy safeguards.
Contribution
The paper presents a novel privacy auditing framework, Parsing, with improved attack techniques and a two-stage pipeline to monitor privacy risks in language model fine-tuning.
Findings
Effective privacy risk detection across various models
Significant privacy concerns identified during fine-tuning
Framework demonstrates high efficiency and reliability
Abstract
The pretraining and fine-tuning approach has become the leading technique for various NLP applications. However, recent studies reveal that fine-tuning data, due to their sensitive nature, domain-specific characteristics, and identifiability, pose significant privacy concerns. To help develop more privacy-resilient fine-tuning models, we introduce a novel active privacy auditing framework, dubbed Parsing, designed to identify and quantify privacy leakage risks during the supervised fine-tuning (SFT) of language models (LMs). The framework leverages improved white-box membership inference attacks (MIAs) as the core technology, utilizing novel learning objectives and a two-stage pipeline to monitor the privacy of the LMs' fine-tuning process, maximizing the exposure of privacy risks. Additionally, we have improved the effectiveness of MIAs on large LMs including GPT-2, Llama2, and certain…
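To make the idea of a membership inference attack concrete, here is a minimal sketch of how such an attack is commonly scored. It is not the Parsing framework and not the paper's white-box attack. It assumes a generic loss-based baseline in which a lower per-example loss on the fine-tuned model suggests the example was in the fine-tuning set. The function names `membership_scores` and `attack_auc` and the loss values are hypothetical and for illustration only.

```python
def membership_scores(losses):
    """Turn per-example losses into attack scores.

    Assumption (generic baseline, not the paper's method): members of the
    fine-tuning set tend to have lower loss, so the score is the negated loss.
    """
    return [-loss for loss in losses]


def attack_auc(member_scores, nonmember_scores):
    """Estimate the attack's AUC by comparing every member/non-member pair.

    The AUC equals the probability that a randomly chosen member receives a
    higher score than a randomly chosen non-member; ties count as one half.
    """
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Hypothetical losses, made up for illustration only.
member_losses = [0.1, 0.2, 0.4]      # examples seen during fine-tuning
nonmember_losses = [0.3, 0.8, 1.1]   # held-out examples

auc = attack_auc(
    membership_scores(member_losses),
    membership_scores(nonmember_losses),
)
print(f"attack AUC = {auc:.3f}")
```

An AUC near 0.5 means the attack cannot tell members from non-members, while values closer to 1.0 indicate stronger leakage. A white-box auditor could plug other per-example signals, such as gradient-based features, into the same scoring step.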
Taxonomy
Topics: Privacy-Preserving Technologies in Data
Methods: Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Linear Layer · Dense Connections · Layer Normalization · Adam · Attention Dropout · Multi-Head Attention
