Information Leakage from Data Updates in Machine Learning Models

Tian Hui; Farhad Farokhi; Olga Ohrimenko

arXiv:2309.11022 · cs.LG · January 4, 2024

Open Access

TL;DR

This paper explores how an adversary can infer information about data updates in machine learning models by comparing model snapshots taken before and after retraining. Leakage is most pronounced for rare attribute values and for repeated updates.

Contribution

It introduces novel attack methods based on prediction confidence differences to infer data attribute changes during model updates, highlighting privacy risks.

Findings

01. Model snapshots leak more information than single models.

02. Rare attribute values are more vulnerable to inference attacks.

03. Repeated updates increase the likelihood of correct attribute inference.

Abstract

In this paper we consider the setting where machine learning models are retrained on updated datasets in order to incorporate the most up-to-date information or reflect distribution shifts. We investigate whether one can infer information about these updates in the training data (e.g., changes to attribute values of records). Here, the adversary has access to snapshots of the machine learning model before and after the change in the dataset occurs. In contrast to the existing literature, we assume that an attribute of one or more training data points is changed, rather than entire data records being removed or added. We propose attacks based on the difference in the prediction confidence of the original model and the updated model. We evaluate our attack methods on two public datasets along with multi-layer perceptron and logistic regression models. We validate that two snapshots of…
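To make the idea of a confidence-difference attack concrete, here is a minimal sketch in Python. It is not the authors' attack: the scoring rule (pick the candidate attribute value whose probe record gains the most confidence on its true label between snapshots), the function names `make_logistic` and `infer_updated_value`, and the toy weights and attribute names are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Record = Dict[str, float]
# A snapshot is modelled as a function returning P(label | record).
ConfidenceFn = Callable[[Record, int], float]


def make_logistic(weights: Dict[str, float], bias: float) -> ConfidenceFn:
    """Build a toy binary logistic model (hypothetical stand-in for a snapshot)."""
    def confidence(record: Record, label: int) -> float:
        z = bias + sum(w * record[name] for name, w in weights.items())
        p_positive = 1.0 / (1.0 + math.exp(-z))
        return p_positive if label == 1 else 1.0 - p_positive
    return confidence


def infer_updated_value(
    before: ConfidenceFn,
    after: ConfidenceFn,
    record: Record,
    label: int,
    attribute: str,
    candidates: Sequence[float],
) -> Tuple[float, Dict[float, float]]:
    """Guess the new value of `attribute` from two model snapshots.

    Assumed heuristic: the true updated value is the candidate whose
    probe record shows the largest confidence increase on its label.
    """
    scores: Dict[float, float] = {}
    for value in candidates:
        probe = dict(record)          # copy so the input record is untouched
        probe[attribute] = value
        scores[value] = after(probe, label) - before(probe, label)
    best = max(scores, key=lambda v: scores[v])
    return best, scores


# Toy snapshots: after the update the model has learned to rely on "smoker".
before = make_logistic({"age": 0.5, "smoker": 0.0}, bias=0.0)
after = make_logistic({"age": 0.5, "smoker": 1.0}, bias=0.0)

record = {"age": 1.0, "smoker": 0.0}  # the attacker's (stale) view of the record
guess, scores = infer_updated_value(
    before, after, record, label=1, attribute="smoker", candidates=[0.0, 1.0]
)
print(guess, scores)
```

In this toy setup the probe with `smoker = 1.0` is the only one whose confidence changes between snapshots, so the heuristic selects it. Real models would need the score compared against a baseline or threshold, which this sketch leaves out.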

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)

Methods: Logistic Regression