Distribution inference risks: Identifying and mitigating sources of leakage

Valentin Hartmann; Léo Meynent; Maxime Peyrard; Dimitrios Dimitriadis; Shruti Tople; Robert West

arXiv:2209.08541·cs.CR·September 20, 2022

TL;DR

This paper analyzes the causes of distribution inference leaks in machine learning models, identifying key sources of leakage and proposing mitigation strategies, including causal learning techniques, to enhance privacy protections.

Contribution

It provides a theoretical and empirical analysis of leakage sources in distribution inference attacks and introduces principled mitigation methods, notably causal learning, to improve model privacy.

Findings

01

Identified three main sources of leakage: memorization of $E[Y \mid X]$, wrong inductive bias, and finite data.

02

Causal learning techniques are more resilient to distributional membership inference than associative learning methods.

03

Formalized distribution inference to enable reasoning about more complex adversaries.

Abstract

A large body of work shows that machine learning (ML) models can leak sensitive or confidential information about their training data. Recently, leakage due to distribution inference (or property inference) attacks has been gaining attention. In such an attack, the adversary's goal is to infer distributional information about the training data. So far, research on distribution inference has focused on demonstrating successful attacks, with little attention given to identifying the potential causes of the leakage and to proposing mitigations. To bridge this gap, as our main contribution, we theoretically and empirically analyze the sources of information leakage that allow an adversary to perpetrate distribution inference attacks. We identify three sources of leakage: (1) memorizing specific information about the $E[Y \mid X]$ (expected label given the feature values) of interest to the…
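
To make the first leakage source concrete, the following is a minimal toy sketch, not the paper's experimental setup. It assumes a binary feature X, a binary label Y, and an adversary who knows that P(Y=1 | X=1) in the training distribution is one of two candidate values. The "model" simply stores the empirical $E[Y \mid X]$, and the adversary queries it at X=1. All function names, the candidate values 0.3 and 0.7, and the sample sizes are hypothetical choices made for illustration.

```python
import numpy as np

def sample_training_set(p_y_given_x1, n, rng):
    """Draw n points with binary feature X and binary label Y.
    P(Y=1 | X=0) is fixed at 0.5; P(Y=1 | X=1) is the sensitive quantity."""
    x = rng.integers(0, 2, size=n)
    p = np.where(x == 1, p_y_given_x1, 0.5)
    y = (rng.random(n) < p).astype(int)
    return x, y

def fit_conditional_mean(x, y):
    """Toy 'model' that memorizes the empirical E[Y | X=v] for v in {0, 1}."""
    return np.array([y[x == v].mean() for v in (0, 1)])

def adversary_guess(model, candidates):
    """Query the model at X=1 and return the candidate closest to its output."""
    return min(candidates, key=lambda c: abs(model[1] - c))

rng = np.random.default_rng(0)
candidates = (0.3, 0.7)   # hypothetical values the adversary chooses between
trials = 200
correct = 0
for t in range(trials):
    truth = candidates[t % 2]                      # alternate the true distribution
    x, y = sample_training_set(truth, n=500, rng=rng)
    model = fit_conditional_mean(x, y)
    correct += adversary_guess(model, candidates) == truth

accuracy = correct / trials
print(f"adversary accuracy over {trials} trials: {accuracy:.2f}")
```

In this sketch the adversary succeeds because the model's output at X=1 is a direct estimate of the sensitive conditional expectation. The other two sources named in the abstract (wrong inductive bias and finite data) are not modeled here.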

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Deception detection and forensic psychology