Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets
Florian Tramèr, Reza Shokri, Ayrton San Joaquin, Hoang Le, Matthew Jagielski, Sanghyun Hong, and Nicholas Carlini

TL;DR
This paper presents novel data poisoning attacks that cause machine learning models to leak private training data, undermining privacy guarantees and exposing vulnerabilities in current privacy-preserving protocols.
Contribution
The paper introduces active inference attacks, which connect data poisoning with privacy leakage, and demonstrates their effectiveness across membership inference, attribute inference, and data extraction.
Findings
Targeted poisoning of less than 0.1% of the training data boosts inference attack performance by 10-100x
Controlling 50% of the training data enables untargeted attacks with 8x more precise inference on other users' private data points
Attacks threaten cryptographic privacy guarantees in multiparty ML protocols
Abstract
We introduce a new class of attacks on machine learning models. We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak significant private details of training points belonging to other parties. Our active inference attacks connect two independent lines of work targeting the integrity and privacy of machine learning training data. Our attacks are effective across membership inference, attribute inference, and data extraction. For example, our targeted attacks can poison <0.1% of the training dataset to boost the performance of inference attacks by 1 to 2 orders of magnitude. Further, an adversary who controls a significant fraction of the training data (e.g., 50%) can launch untargeted attacks that enable 8x more precise inference on all other users' otherwise-private data points. Our results cast doubts on the relevance of…
