Reconstruction of Personally Identifiable Information from Supervised Finetuned Models

Sae Furukawa; Alina Oprea

arXiv:2605.12264·cs.CR·May 13, 2026

TL;DR

This paper studies the privacy risks of supervised finetuned (SFT) language models, measuring how much personally identifiable information (PII) an adversary can reconstruct from them using new datasets and a novel decoding algorithm, COVA.

Contribution

The paper introduces realistic multi-turn Q&A datasets in medical and legal settings for evaluating PII leakage, and proposes COVA, a decoding method that improves PII reconstruction from finetuned models.

Findings

01

Partial attacker knowledge greatly improves PII reconstruction success.

02

Leakage varies significantly across different PII types.

03

COVA outperforms existing extraction methods in reconstructing PII.

Abstract

Supervised Finetuning (SFT) has become one of the primary methods for adapting a large language model (LLM) with extensive pre-trained knowledge to domain-specific, instruction-following tasks. SFT datasets, composed of instruction-response pairs, often include user-provided information that may contain sensitive data such as personally identifiable information (PII), raising privacy concerns. This paper studies the problem of PII reconstruction from SFT models for the first time. We construct multi-turn, user-centric Q&A datasets in sensitive domains, specifically medical and legal settings, that incorporate PII to enable realistic evaluation of leakage. Using these datasets, we evaluate the extent to which an adversary, with varying levels of knowledge about the fine-tuning dataset, can infer sensitive information about individuals whose data was used during SFT. In the reconstruction…
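The evaluation setting described in the abstract, an adversary with partial knowledge of a record trying to recover the remaining PII, can be illustrated with a small scoring harness. This is a minimal sketch under stated assumptions, not the paper's method and not COVA: the record layout, the prompt template, the substring-match criterion, and the names `reconstruction_rate` and `toy_model` are all hypothetical, and `toy_model` is a stand-in lookup table rather than a finetuned LLM.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not counted as misses."""
    return " ".join(text.lower().split())


def reconstruction_rate(
    records: List[Dict[str, str]],
    known_fields: List[str],
    target_fields: List[str],
    generate: Callable[[str], str],
) -> Dict[str, float]:
    """Per PII type, the fraction of records whose true value appears in the model output.

    Assumption: attacker knowledge is modeled as the set of fields placed in the prompt.
    """
    if not records:
        return {}
    hits = defaultdict(int)
    for record in records:
        context = "; ".join(f"{field}: {record[field]}" for field in known_fields)
        for target in target_fields:
            prompt = f"{context}\nWhat is this person's {target}?"
            output = normalize(generate(prompt))
            if normalize(record[target]) in output:
                hits[target] += 1
    return {target: hits[target] / len(records) for target in target_fields}


# Hypothetical toy data and a stub "model" that has memorized only some values.
records = [
    {"name": "Alice Example", "email": "alice@example.com", "phone": "555-0100"},
    {"name": "Bob Example", "email": "bob@example.com", "phone": "555-0101"},
]

MEMORIZED = {
    ("Alice Example", "email"): "alice@example.com",
    ("Bob Example", "email"): "bob@example.com",
    ("Alice Example", "phone"): "555-0100",
}


def toy_model(prompt: str) -> str:
    """Return a memorized value only when the prompt names both the person and the field."""
    for (name, field), value in MEMORIZED.items():
        if name in prompt and f"person's {field}" in prompt:
            return value
    return ""


with_name = reconstruction_rate(records, ["name"], ["email", "phone"], toy_model)
without_name = reconstruction_rate(records, [], ["email", "phone"], toy_model)
```

In this toy setup, `with_name` gives the rate when the attacker knows each person's name, and `without_name` gives the rate with no prior knowledge. Reporting the rate separately for each target field mirrors the idea of comparing leakage across PII types; a real evaluation would replace `toy_model` with calls to the finetuned model and would likely need a more careful matching rule than substring containment.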
