Survey: Leakage and Privacy at Inference Time
Marija Jegorova, Chaitanya Kaul, Charlie Mayor, Alison Q. O'Neil, Alexander Weir, Roderick Murray-Smith, and Sotirios A. Tsaftaris

TL;DR
This survey comprehensively reviews inference-time data leakage in machine learning models, covering types of leakage, attack methods, defenses, metrics, and future research directions.
Contribution
It provides a detailed taxonomy and analysis of leakage types, attack and defense mechanisms, and assessment metrics specific to inference-time leakage in ML models.
Findings
Involuntary leakage is inherent to ML models and varies across data types and tasks.
Privacy attacks exploit model vulnerabilities to extract sensitive information.
Defense mechanisms are available to mitigate leakage, and assessment metrics are available to evaluate it.
Abstract
Leakage of data from publicly available Machine Learning (ML) models is an area of growing significance as commercial and government applications of ML can draw on multiple sources of data, potentially including users' and clients' sensitive data. We provide a comprehensive survey of contemporary advances on several fronts, covering involuntary data leakage which is natural to ML models, potential malevolent leakage which is caused by privacy attacks, and currently available defence mechanisms. We focus on inference-time leakage, as the most likely scenario for publicly available models. We first discuss what leakage is in the context of different data, tasks, and model architectures. We then propose a taxonomy across involuntary and malevolent leakage, available defences, followed by the currently available assessment metrics and applications. We conclude with outstanding challenges…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Security and Verification in Computing
