Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets

Florian Tramèr; Reza Shokri; Ayrton San Joaquin; Hoang Le; Matthew Jagielski; Sanghyun Hong; Nicholas Carlini

arXiv:2204.00032 · cs.CR · October 7, 2022

TL;DR

This paper presents poisoning attacks in which an adversary who controls part of the training set causes the trained model to leak private details of other parties' training data, which weakens the privacy guarantees of multiparty training protocols.

Contribution

The paper introduces active inference attacks, which connect data poisoning (an integrity attack) with privacy attacks, and shows that they are effective for membership inference, attribute inference, and data extraction.

Findings

01

Targeted poisoning of less than 0.1% of the training data boosts inference attack performance by 10-100x

02

An adversary controlling 50% of the training data can launch untargeted attacks that make inference on all other users' data points 8x more precise

03

Attacks threaten cryptographic privacy guarantees in multiparty ML protocols

Abstract

We introduce a new class of attacks on machine learning models. We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak significant private details of training points belonging to other parties. Our active inference attacks connect two independent lines of work targeting the integrity and privacy of machine learning training data. Our attacks are effective across membership inference, attribute inference, and data extraction. For example, our targeted attacks can poison <0.1% of the training dataset to boost the performance of inference attacks by 1 to 2 orders of magnitude. Further, an adversary who controls a significant fraction of the training data (e.g., 50%) can launch untargeted attacks that enable 8x more precise inference on all other users' otherwise-private data points. Our results cast doubts on the relevance of…
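
To make the "<0.1% of the training dataset" budget concrete, here is a minimal sketch of how a targeted poison set might be assembled and its size checked. It assumes a label-flipping strategy (replicas of the target's features with a wrong label); the abstract above does not say how the poison is crafted, so treat that strategy, the helper names `make_label_flip_poison` and `poison_fraction`, and all numbers as illustrative assumptions rather than the authors' method.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label)


def make_label_flip_poison(target: Example, num_classes: int, copies: int) -> List[Example]:
    """Return `copies` replicas of the target's features with a wrong label.

    Assumption: the wrong label is simply the next class index; a real
    attack could choose it differently.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes to pick a wrong label")
    features, true_label = target
    wrong_label = (true_label + 1) % num_classes
    return [(tuple(features), wrong_label) for _ in range(copies)]


def poison_fraction(num_clean: int, num_poison: int) -> float:
    """Fraction of the final training set contributed by the adversary."""
    return num_poison / (num_clean + num_poison)


# Hypothetical setup: 50,000 clean points, 8 poisoned replicas of one target.
target = ((0.2, 0.7, 0.1), 3)
poison = make_label_flip_poison(target, num_classes=10, copies=8)
fraction = poison_fraction(num_clean=50_000, num_poison=len(poison))

print(f"poisoned points: {len(poison)}, fraction of training set: {fraction:.4%}")
print("within 0.1% budget:", fraction < 0.001)
```

In this hypothetical setup the adversary contributes 8 of 50,008 points, well under the 0.1% budget. The sketch only builds the poison set; it does not train a model or run an inference attack.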
