Differentially Private Boxplots
Kelly Ramsay, Jairo Diaz-Rodriguez

TL;DR
This paper introduces a differentially private boxplot that effectively visualizes data distribution features while maintaining privacy, with theoretical guarantees and practical performance comparable to non-private methods.
Contribution
It presents the first differentially private boxplot, with optimal estimation of location and scale, and consistent skewness and tail estimation, along with new private quantile results.
Findings
Performs similarly to non-private boxplots in simulations
Outperforms naive differentially private boxplots
Enables privacy-preserving analysis of real datasets like Airbnb
Abstract
Despite the potential of differentially private data visualization to harmonize data analysis and privacy, research in this area remains underdeveloped. Boxplots are a widely popular visualization used for summarizing a dataset and for comparison of multiple datasets. Consequently, we introduce a differentially private boxplot. We evaluate its effectiveness for displaying location, scale, skewness and tails of a given empirical distribution. In our theoretical exposition, we show that the location and scale of the boxplot are estimated with optimal sample complexity, and the skewness and tails are estimated consistently, which is not always the case for a boxplot naively constructed from a single existing differentially private quantile algorithm. As a byproduct of this exposition, we introduce several new results concerning private quantile estimation. In simulations, we show that…
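To make the "naively constructed" baseline mentioned in the abstract concrete, the sketch below builds a boxplot summary from a single private quantile routine. It is not the authors' algorithm. It assumes the standard exponential-mechanism quantile estimator on data clipped to known bounds, and the function names (dp_quantile, naive_dp_boxplot), the bounds lo and hi, the even budget split, and the use of fences in place of whiskers are all illustrative assumptions.

```python
import numpy as np


def dp_quantile(x, q, eps, lo, hi, rng):
    """Exponential-mechanism estimate of the q-th quantile (assumes lo < hi)."""
    # Clip to the assumed public bounds and pad with the bounds themselves.
    z = np.sort(np.clip(np.asarray(x, dtype=float), lo, hi))
    z = np.concatenate(([lo], z, [hi]))
    n = len(z) - 2
    widths = np.diff(z)                   # n + 1 candidate intervals
    ranks = np.arange(n + 1)              # data points to the left of each interval
    utility = -np.abs(ranks - q * n)      # sensitivity 1 in the data
    with np.errstate(divide="ignore"):    # zero-width intervals get weight 0
        log_w = np.log(widths) + 0.5 * eps * utility
    log_w -= log_w.max()                  # stabilise before exponentiating
    p = np.exp(log_w)
    p /= p.sum()
    i = rng.choice(n + 1, p=p)            # pick an interval ...
    return rng.uniform(z[i], z[i + 1])    # ... then a point inside it


def naive_dp_boxplot(x, eps, lo, hi, seed=None):
    """Naive private boxplot summary: three quantile calls sharing the budget."""
    rng = np.random.default_rng(seed)
    qs = (0.25, 0.5, 0.75)
    # Basic composition: each call uses eps / 3, so the total is eps.
    est = [dp_quantile(x, q, eps / len(qs), lo, hi, rng) for q in qs]
    q1, med, q3 = sorted(est)             # sorting is post-processing
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        # Simplification: report the 1.5 * IQR fences rather than whiskers,
        # since whisker ends are data points and would need their own budget.
        "lower_fence": max(lo, q1 - 1.5 * iqr),
        "upper_fence": min(hi, q3 + 1.5 * iqr),
    }
```

A call might look like naive_dp_boxplot(prices, eps=1.0, lo=0.0, hi=1000.0), where prices is a hypothetical one-dimensional array. Because the fences are derived from noisy quartiles, errors in q1 and q3 are amplified in the tails, which gives one intuition for why a naive construction of this kind can represent skewness and tails poorly.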
Taxonomy
Topics: Data-Driven Disease Surveillance
