Privacy Side Channels in Machine Learning Systems

Edoardo Debenedetti; Giorgio Severi; Nicholas Carlini; Christopher A. Choquette-Choo; Matthew Jagielski; Milad Nasr; Eric Wallace; Florian Tramèr

arXiv:2309.05610 · cs.CR · July 19, 2024 · 6 cites

Open Access

TL;DR

This paper reveals how system-level components in machine learning systems can be exploited as privacy side channels, enabling attacks like membership inference and data exfiltration that undermine privacy guarantees.

Contribution

It introduces the concept of privacy side channels in ML systems, categorizes them across the ML lifecycle, and demonstrates their potential to breach privacy protections.

Findings

01

Deduplicating training data before differentially private training can invalidate the differential privacy guarantees.

02

Filters that block a language model from regenerating its training data can be exploited to exfiltrate private keys contained in that data.

03

System-level components can be exploited to perform privacy attacks beyond standalone models.
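
To make finding 02 concrete, here is a minimal sketch of how an output filter can act as an oracle. It is a toy under stated assumptions, not the paper's attack: the corpus, the key format, the window size, and the helper names `filtered_echo` and `extract` are all hypothetical, and the "model" simply echoes its input. The filter blocks any output that shares an 8-character substring with the training data, so a blocked response tells the attacker that the guessed character was correct.

```python
# Hypothetical training data containing a secret; the attacker never reads this directly.
SECRET_CORPUS = ["api_key=7f3a9c"]
WINDOW = 8  # assumed filter rule: block outputs sharing any WINDOW-char substring with the corpus


def filtered_echo(text):
    """Stand-in for a model that repeats `text`, wrapped in a memorization filter.

    Returns None when the filter blocks the output, otherwise returns the text.
    """
    for doc in SECRET_CORPUS:
        for i in range(len(text) - WINDOW + 1):
            if text[i:i + WINDOW] in doc:
                return None
    return text


def extract(prefix, alphabet, max_len):
    """Extend a known prefix one character at a time, using 'blocked' as the signal."""
    recovered = prefix
    for _ in range(max_len):
        for ch in alphabet:
            # Query only the last WINDOW characters so each probe tests one new character.
            if filtered_echo((recovered + ch)[-WINDOW:]) is None:
                recovered += ch
                break
        else:
            # No candidate was blocked: the secret has ended.
            break
    return recovered


print(extract("api_key=", "0123456789abcdef", max_len=32))  # prints api_key=7f3a9c
```

The sketch assumes the attacker already knows a prefix at least as long as the filter window; with a shorter prefix, no probe would be long enough to trigger this toy filter.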

Abstract

Most current approaches for protecting privacy in machine learning (ML) assume that models exist in a vacuum. Yet, in reality, these models are part of larger systems that include components for training data filtering, output monitoring, and more. In this work, we introduce privacy side channels: attacks that exploit these system-level components to extract private information at far higher rates than is otherwise possible for standalone models. We propose four categories of side channels that span the entire ML lifecycle (training data filtering, input preprocessing, output post-processing, and query filtering) and allow for enhanced membership inference, data extraction, and even novel threats such as extraction of users' test queries. For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely…
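
The deduplication example in the abstract can be illustrated with a small sketch. Differential privacy analyses typically compare two datasets that differ in one record; the toy below shows how an approximate deduplication step can turn a one-record difference into a larger one before training even starts. Everything here is an assumption made for illustration: the 2-D integer "records", the distance-1 similarity rule, and the keep-first-per-cluster policy are not taken from the paper.

```python
from itertools import combinations


def near_duplicates(a, b, threshold_sq=1):
    """Toy similarity rule: two 2-D integer points within Euclidean distance 1."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 <= threshold_sq


def deduplicate(records):
    """Keep the earliest record of every connected cluster of near-duplicates."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if near_duplicates(records[i], records[j]):
            ri, rj = find(i), find(j)
            # Attach the larger root to the smaller, so the earliest record represents the cluster.
            parent[max(ri, rj)] = min(ri, rj)

    return [r for i, r in enumerate(records) if find(i) == i]


# Four mutually dissimilar records, plus one "hub" record that is close to all of them.
base = [(1, 0), (0, 1), (-1, 0), (0, -1)]
with_hub = base + [(0, 0)]

kept_base = deduplicate(base)        # all four records survive
kept_hub = deduplicate(with_hub)     # the hub merges everything into one cluster
print(len(set(kept_base) ^ set(kept_hub)))  # prints 3
```

The two inputs differ in a single record, yet the deduplicated outputs differ in three, so a guarantee computed for the training step alone no longer describes the pipeline as a whole.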

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Cryptography and Data Security