PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining

Mishaal Kazmi; Hadrien Lautraite; Alireza Akbari; Qiaoyue Tang; Mauricio Soroco; Tao Wang; Sébastien Gambs; Mathias Lécuyer

arXiv:2402.09477 · cs.CR · October 29, 2024 · 1 citation

TL;DR

PANORAMIA is a privacy auditing framework for machine learning models that uses membership inference attacks with generated non-member data, avoiding retraining or model modification.

Contribution

It introduces a novel privacy measurement method that does not require in-distribution non-member data or retraining, simplifying privacy audits.

Findings

01

Evaluated on image and tabular classification models, as well as on large-scale language models.

02

Requires access to only a subset of the training data, and does not modify the model, training data, or training process.

03

Eliminates need for retraining during privacy assessment.

Abstract

We present PANORAMIA, a privacy leakage measurement framework for machine learning models that relies on membership inference attacks using generated data as non-members. By relying on generated non-member data, PANORAMIA eliminates the common dependency of privacy measurement tools on in-distribution non-member data. As a result, PANORAMIA does not modify the model, training data, or training process, and only requires access to a subset of the training data. We evaluate PANORAMIA on ML models for image and tabular data classification, as well as on large-scale language models.
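To make the idea of a membership inference attack with generated non-members concrete, here is a minimal sketch in Python. It is not the authors' implementation: the loss values are synthetic, and the names `threshold_attack` and `best_threshold` are hypothetical helpers introduced only for illustration. The sketch assumes that the target model assigns lower loss to its training examples (members) than to generated examples (non-members), and that the attacker guesses membership by thresholding that loss.

```python
import random


def threshold_attack(member_losses, nonmember_losses, threshold):
    """Balanced accuracy of the rule: guess 'member' when loss < threshold."""
    true_pos = sum(loss < threshold for loss in member_losses)
    true_neg = sum(loss >= threshold for loss in nonmember_losses)
    # Average the per-class hit rates so both classes count equally.
    return 0.5 * (true_pos / len(member_losses) + true_neg / len(nonmember_losses))


def best_threshold(member_losses, nonmember_losses):
    """Pick the candidate threshold with the highest balanced accuracy."""
    candidates = sorted(set(member_losses) | set(nonmember_losses))
    return max(
        candidates,
        key=lambda t: threshold_attack(member_losses, nonmember_losses, t),
    )


rng = random.Random(0)

# ASSUMPTION: synthetic stand-ins for the target model's per-example loss.
# In a real audit these would come from querying the audited model on
# (a) a known subset of its training data and (b) generator samples.
member_losses = [rng.lognormvariate(-1.0, 0.5) for _ in range(1000)]
generated_losses = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]

# Fit the threshold on one half, then measure the attack on the held-out half.
threshold = best_threshold(member_losses[:500], generated_losses[:500])
accuracy = threshold_attack(member_losses[500:], generated_losses[500:], threshold)

print(f"threshold={threshold:.3f}  held-out attack accuracy={accuracy:.3f}")
```

An attack accuracy well above 0.5 on held-out examples indicates that the model's behavior distinguishes members from generated non-members. Turning such a signal into a quantitative privacy measurement is the role of the framework itself and is not covered by this sketch.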

Code & Models

Repositories

ubc-systopia/panoramia-privacy-measurement
PyTorch · Official

Videos

PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining · SlidesLive

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Explainable Artificial Intelligence (XAI) · Big Data and Business Intelligence