Optimal Private and Communication Constraint Distributed Goodness-of-Fit Testing for Discrete Distributions in the Large Sample Regime
Lasse Vuursteen

TL;DR
This paper studies distributed goodness-of-fit testing for discrete distributions under bandwidth and differential privacy constraints, and derives matching minimax upper and lower bounds in the large-sample regime.
Contribution
It extends Gaussian mean testing results to discrete distributions using Le Cam's theory, deriving matching minimax bounds under privacy and bandwidth constraints.
Findings
Derived matching minimax bounds for discrete goodness-of-fit testing.
Extended Gaussian testing results to discrete distributions via Le Cam's theory.
Established optimal testing procedures in large-sample distributed settings.
Abstract
We study distributed goodness-of-fit testing for discrete distributions under bandwidth and differential privacy constraints. Information-constrained distributed goodness-of-fit testing is a problem that has received considerable attention recently. The important case of discrete distributions is theoretically well understood in the classical case where all data is available in one "central" location. In a federated setting, however, data is distributed across multiple "locations" (e.g. servers) and cannot readily be shared due to, e.g., bandwidth or privacy constraints that each server needs to satisfy. We show how recently derived results for goodness-of-fit testing for the mean of a multivariate Gaussian model extend to discrete distributions, by leveraging Le Cam's theory of statistical equivalence. In doing so, we derive matching minimax upper and lower bounds for the…
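To make the setting concrete, here is a minimal Python sketch of a privacy-constrained distributed goodness-of-fit test. It is an illustration under stated assumptions and not the procedure analysed in the paper: each of `m` servers holds `n` draws from a discrete distribution, releases its count vector through the Laplace mechanism, and a central aggregator computes a chi-square-type statistic whose rejection threshold is calibrated by Monte Carlo simulation under the null. All function names and default parameters are hypothetical, and bandwidth constraints are not modelled.

```python
import numpy as np


def privatize_counts(counts, epsilon, rng):
    """Release count vectors with epsilon-differential privacy (Laplace mechanism).

    Assumption: neighbouring datasets differ by replacing one observation,
    which changes two cells by 1 each, so the L1 sensitivity is 2.
    """
    return counts + rng.laplace(scale=2.0 / epsilon, size=counts.shape)


def aggregate_statistic(noisy_counts, p0, n_per_server):
    """Chi-square-type statistic computed from the summed noisy counts."""
    total = noisy_counts.sum(axis=0)
    expected = noisy_counts.shape[0] * n_per_server * p0
    return float(np.sum((total - expected) ** 2 / expected))


def simulate_statistic(p, p0, m, n, epsilon, rng):
    """Draw data from p on m servers, privatize locally, then aggregate."""
    counts = rng.multinomial(n, p, size=m).astype(float)  # shape (m, d)
    noisy = privatize_counts(counts, epsilon, rng)
    return aggregate_statistic(noisy, p0, n)


def private_gof_test(p_true, p0, m=20, n=500, epsilon=1.0,
                     alpha=0.05, n_null=500, seed=0):
    """Test H0: p = p0 at level alpha; data are simulated from p_true."""
    rng = np.random.default_rng(seed)
    # Calibrate the threshold by simulating the whole pipeline under H0,
    # so the added Laplace noise is accounted for in the null distribution.
    null_stats = [simulate_statistic(p0, p0, m, n, epsilon, rng)
                  for _ in range(n_null)]
    threshold = float(np.quantile(null_stats, 1 - alpha))
    observed = simulate_statistic(p_true, p0, m, n, epsilon, rng)
    return observed > threshold, observed, threshold


if __name__ == "__main__":
    p0 = np.full(4, 0.25)
    reject, observed, threshold = private_gof_test(
        np.array([0.4, 0.3, 0.2, 0.1]), p0)
    print(f"reject H0: {reject} (statistic {observed:.1f}, "
          f"threshold {threshold:.1f})")
```

Calibrating by simulation keeps the sketch valid for any noise level without a closed-form null distribution; it says nothing about minimax optimality, which is the question the paper addresses.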
Taxonomy
Topics: Distributed Sensor Networks and Detection Algorithms · Statistical Methods and Bayesian Inference · Advanced Statistical Process Monitoring
