Hypothesis Testing of One-Sample Mean Vector in Distributed Frameworks
Bin Du, Junlong Zhao

TL;DR
This paper develops distributed hypothesis tests for the mean vector in large-scale data settings, balancing communication costs and statistical power, and extends classical tests to distributed frameworks for both low and high dimensions.
Contribution
The paper introduces distributed test statistics for mean-vector hypotheses that reduce communication costs, and analyzes the power tradeoff relative to the centralized tests.
Findings
Distributed tests significantly reduce communication costs.
A tradeoff exists between test power and communication efficiency.
Numerical results validate theoretical insights.
Abstract
Distributed frameworks are widely used to handle massive data, where the sample size is very large and the data are often stored on different machines. For a random vector $X$ with expectation $\mu$, testing the mean vector $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$ for a given vector $\mu_0$ is a basic problem in statistics. The centralized test statistics require heavy communication costs, which can be a burden when the dimension or the sample size is large. To reduce the communication cost, this paper proposes distributed test statistics for this problem based on the divide-and-conquer technique, a commonly used approach for distributed statistical inference. Specifically, we extend two commonly used centralized test statistics to distributed versions, which apply to the low- and high-dimensional cases, respectively. Comparing the power of centralized test statistics and the distributed…
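As a rough illustration of the divide-and-conquer idea in the abstract, the sketch below shows one possible low-dimensional scheme. It is an assumption made for illustration, not the statistic proposed in the paper: each of K machines computes a local Hotelling T² on its own block and sends a single scalar to the center, which sums them. Under the null hypothesis and with large local sample sizes, each local T² is approximately chi-square with p degrees of freedom, so the sum is approximately chi-square with Kp degrees of freedom, whereas the pooled (centralized) T² has only p. All function names here (`solve`, `hotelling_t2`, `distributed_t2`) are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hotelling_t2(rows, mu0):
    """Hotelling T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0) for one block of data."""
    n, p = len(rows), len(mu0)
    xbar = [sum(r[j] for r in rows) / n for j in range(p)]
    # Unbiased sample covariance matrix S (p x p).
    cov = [[sum((r[i] - xbar[i]) * (r[j] - xbar[j]) for r in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    d = [xbar[j] - mu0[j] for j in range(p)]
    w = solve(cov, d)  # w = S^{-1} d without forming the inverse
    return n * sum(di * wi for di, wi in zip(d, w))


def distributed_t2(blocks, mu0):
    """Illustrative aggregation (an assumption, not the paper's statistic):
    each machine sends one scalar, and the center returns the sum together
    with the approximate chi-square degrees of freedom K * p."""
    local = [hotelling_t2(block, mu0) for block in blocks]
    return sum(local), len(blocks) * len(mu0)


if __name__ == "__main__":
    random.seed(0)
    k, n, p = 4, 200, 3          # machines, local sample size, dimension
    mu0 = [0.0] * p              # data are generated under H0
    blocks = [[[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
              for _ in range(k)]
    pooled = [row for block in blocks for row in block]
    stat, df = distributed_t2(blocks, mu0)
    print("centralized T2:", hotelling_t2(pooled, mu0), "df:", p)
    print("distributed T2:", stat, "df:", df)
```

In this sketch the communication per machine is one number instead of a p-vector and a p×p matrix. The larger degrees of freedom of the summed statistic is one way to see why such a scheme can lose power relative to the centralized test.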
Taxonomy
Topics: Statistical Methods and Inference · Distributed Sensor Networks and Detection Algorithms · Markov Chains and Monte Carlo Methods
