Every Query Counts: Analyzing the Privacy Loss of Exploratory Data Analyses
Saskia Nuñez von Voigt, Mira Pauli, Johanna Reichert, Florian Tschorsch

TL;DR
This paper quantifies the privacy loss incurred during exploratory data analysis, emphasizing its significance in the overall privacy budget for machine learning workflows.
Contribution
It provides a detailed analysis of privacy loss for common statistical functions used in exploratory data analysis, highlighting the need to include this in privacy accounting.
Findings
Quantifies privacy loss for basic statistical functions.
Shows importance of accounting for privacy loss in exploratory analysis.
Highlights potential underestimation of privacy risk in current practices.
Abstract
An exploratory data analysis is an essential step for every data analyst to gain insights, evaluate data quality and (if required) select a machine learning model for further processing. While privacy-preserving machine learning is on the rise, more often than not this initial analysis is not counted towards the privacy budget. In this paper, we quantify the privacy loss for basic statistical functions and highlight the importance of taking it into account when calculating the privacy-loss budget of a machine learning approach.
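To make the accounting argument concrete, here is a minimal sketch of how exploratory queries could be charged against a shared budget. It is not the authors' method. It assumes the Laplace mechanism, basic sequential composition (the epsilons of successive queries add up), and add/remove-one-record neighboring datasets. The names `BudgetAccountant`, `laplace_count`, and `laplace_sum` are hypothetical, and the floating-point noise sampler is for illustration only.

```python
import random


class BudgetAccountant:
    """Hypothetical tracker: sums epsilon under basic sequential composition."""

    def __init__(self, total_epsilon):
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon):
        # Refuse any query that would push the total past the budget.
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

    @property
    def remaining(self):
        return self.total_epsilon - self.spent


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_count(data, epsilon, accountant, rng):
    # A counting query changes by at most 1 when one record is added or removed.
    accountant.charge(epsilon)
    return len(data) + laplace_noise(1.0 / epsilon, rng)


def laplace_sum(data, lower, upper, epsilon, accountant, rng):
    # Clamp values to [lower, upper] so one record shifts the sum
    # by at most max(|lower|, |upper|).
    accountant.charge(epsilon)
    clipped = [min(max(x, lower), upper) for x in data]
    sensitivity = max(abs(lower), abs(upper))
    return sum(clipped) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
ages = [rng.randint(18, 89) for _ in range(1000)]

acct = BudgetAccountant(total_epsilon=1.0)
noisy_n = laplace_count(ages, 0.1, acct, rng)          # exploratory query 1
noisy_total = laplace_sum(ages, 0, 100, 0.2, acct, rng)  # exploratory query 2

# Whatever the exploration consumed is no longer available for model training.
print(f"spent={acct.spent:.2f}, remaining={acct.remaining:.2f}")
```

In this sketch, two quick looks at the data already consume 0.3 of a total budget of 1.0, so a model trained afterwards can claim at most the remainder if the overall guarantee is to hold.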
