Better understanding of the multivariate hypergeometric distribution with implications in design-based survey sampling
X.G. Duan

TL;DR
This paper offers a clearer understanding of the covariance structure of the multivariate hypergeometric distribution and explores its implications for various sampling methods and variance estimation techniques.
Contribution
It provides a transparent probabilistic interpretation of the covariance matrix and discusses its impact on sampling efficiency and variance estimation methods.
Findings
Covariance matrix explained via probabilistic symmetry
Implications for sampling efficiency in different methods
Insights into variance estimation techniques
Abstract
The multivariate hypergeometric distribution arises frequently in elementary statistics and probability courses, for simultaneously studying the occurrence law of specified events when sampling without replacement from a finite population with a fixed number of classifications. The covariance matrix of this distribution is well known to be identical to its multinomial counterpart multiplied by 1-(n-1)/(N-1), with N and n being the population and sample sizes, respectively. The meaning of this relationship, however, appears to have been less discussed in the literature, especially regarding the specific form of the multiplier. Based on an augmenting argument together with probabilistic symmetry, we present a more transparent understanding of the covariance structure of the multivariate hypergeometric distribution. We discuss implications of these combined techniques and provide a unified…
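The covariance relationship stated in the abstract can be checked numerically on a small population. The sketch below is not the paper's augmenting argument. It is a brute-force check that enumerates every equally likely sample drawn without replacement and compares the resulting covariance matrix with the multinomial covariance scaled by 1-(n-1)/(N-1). The class sizes, the sample size, and all function names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def hypergeom_cov_by_enumeration(class_sizes, n):
    """Exact covariance of class counts, found by listing every sample of size n."""
    # Label each population unit with the index of its class.
    units = [k for k, size in enumerate(class_sizes) for _ in range(size)]
    K = len(class_sizes)
    samples = list(combinations(range(len(units)), n))  # all equally likely
    total = len(samples)
    mean = [Fraction(0)] * K
    second = [[Fraction(0)] * K for _ in range(K)]
    for sample in samples:
        counts = [0] * K
        for i in sample:
            counts[units[i]] += 1
        for a in range(K):
            mean[a] += counts[a]
            for b in range(K):
                second[a][b] += counts[a] * counts[b]
    mean = [m / total for m in mean]
    # Cov(X_a, X_b) = E[X_a X_b] - E[X_a] E[X_b]
    return [[second[a][b] / total - mean[a] * mean[b] for b in range(K)]
            for a in range(K)]


def multinomial_cov(class_sizes, n):
    """Covariance of a multinomial with the same class proportions p_k = N_k / N."""
    N = sum(class_sizes)
    p = [Fraction(size, N) for size in class_sizes]
    K = len(p)
    return [[n * (p[a] * (1 - p[a]) if a == b else -p[a] * p[b])
             for b in range(K)] for a in range(K)]


def scaled_multinomial_cov(class_sizes, n):
    """Multinomial covariance multiplied by 1 - (n - 1)/(N - 1)."""
    N = sum(class_sizes)
    factor = 1 - Fraction(n - 1, N - 1)
    return [[factor * c for c in row] for row in multinomial_cov(class_sizes, n)]


# Illustrative (assumed) population: N = 9 units in three classes, sample size n = 4.
class_sizes = [3, 4, 2]
n = 4
exact = hypergeom_cov_by_enumeration(class_sizes, n)
scaled = scaled_multinomial_cov(class_sizes, n)
print(exact == scaled)
```

Exact rational arithmetic via `Fraction` is used so that the comparison is an equality test rather than a floating-point tolerance check. Enumeration is only practical for very small N and n.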
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Statistical Distribution Estimation and Applications · Water Quality and Resources Studies
