Towards Robust Federated Analytics via Differentially Private Measurements of Statistical Heterogeneity
Mary Scott, Graham Cormode, Carsten Maple

TL;DR
This paper develops differentially private methods to measure statistical heterogeneity in federated datasets, providing formulas and an analytic mechanism that improves accuracy and robustness over existing approaches.
Contribution
It introduces an analytic mechanism for differentially private heterogeneity measurement in federated settings, optimizing privacy parameters and demonstrating superior accuracy.
Findings
The analytic mechanism outperforms classic and centralized methods.
Statistical heterogeneity measures retain accuracy with heterogeneous samples.
The approach is robust across different heterogeneity levels.
Abstract
Statistical heterogeneity is a measure of how skewed the samples of a dataset are. It is a common problem in the study of differential privacy that the usage of a statistically heterogeneous dataset results in a significant loss of accuracy. In federated scenarios, statistical heterogeneity is more likely to happen, and so the above problem is even more pressing. We explore the three most promising ways to measure statistical heterogeneity and give formulae for their accuracy, while simultaneously incorporating differential privacy. We find the optimum privacy parameters via an analytic mechanism, which incorporates root finding methods. We validate the main theorems and related hypotheses experimentally, and test the robustness of the analytic mechanism to different heterogeneity levels. The analytic mechanism in a distributed setting delivers superior accuracy to all combinations…
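The abstract does not say how the analytic mechanism finds its privacy parameters by root finding. As a minimal sketch, assume it follows the analytic Gaussian mechanism of Balle and Wang (2018): the noise scale sigma is the root of Phi(s/(2*sigma) - eps*sigma/s) - exp(eps) * Phi(-s/(2*sigma) - eps*sigma/s) = delta, where s is the query sensitivity and Phi is the standard normal CDF. The function names, the bisection solver, and the parameter values below are illustrative assumptions, not the authors' implementation.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gaussian_delta(sigma, epsilon, sensitivity=1.0):
    """Smallest delta for which Gaussian noise of scale sigma is (epsilon, delta)-DP.

    Assumption: this is the Balle-Wang characterisation, not a formula
    taken from the summarised paper.
    """
    a = sensitivity / (2.0 * sigma)
    b = epsilon * sigma / sensitivity
    return std_normal_cdf(a - b) - math.exp(epsilon) * std_normal_cdf(-a - b)


def calibrate_sigma(epsilon, delta, sensitivity=1.0, tol=1e-12):
    """Find the smallest sigma meeting (epsilon, delta)-DP by bisection.

    gaussian_delta decreases as sigma grows, so we bracket the root
    and halve the bracket until it is narrower than tol.
    """
    lo, hi = 1e-6 * sensitivity, sensitivity
    while gaussian_delta(hi, epsilon, sensitivity) > delta:
        hi *= 2.0  # expand until the upper end satisfies the target delta
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, epsilon, sensitivity) > delta:
            lo = mid  # too little noise: root lies above mid
        else:
            hi = mid  # enough noise: root lies at or below mid
        if hi - lo < tol:
            break
    return hi  # the upper end always satisfies the privacy target


def classic_sigma(epsilon, delta, sensitivity=1.0):
    """Classic Gaussian mechanism calibration, for comparison."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


if __name__ == "__main__":
    eps, dlt = 1.0, 1e-5  # illustrative privacy parameters
    print("analytic sigma:", calibrate_sigma(eps, dlt))
    print("classic sigma: ", classic_sigma(eps, dlt))
```

Under these assumptions the root-finding calibration returns a smaller noise scale than the classic closed-form bound for the same (epsilon, delta). This is the usual explanation for the accuracy gain of an analytic mechanism over the classic one.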
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Random Matrices and Applications · Cryptography and Data Security
