An examination of the generalised pooled binomial distribution and its information properties
Ben O'Neill, Angus McLure

TL;DR
This paper investigates the statistical properties and information content of a generalised pooled binomial distribution, including its application to pooled testing data, and provides methods for optimal pooling and parameter estimation.
Contribution
It introduces a generalised pooled binomial distribution with covariate effects, analyses its information properties, and develops estimation and diagnostic methods for pooled testing.
Findings
Pooling reduces the information content of individual sample units.
Heuristics for optimal pool size are provided.
Maximum likelihood estimation methods are derived.
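As a rough illustration of the estimation problem, the sketch below covers only the simplest special case: equal pool sizes, a perfect test, and no dilution or intensification effect. These are assumptions of the sketch, not the paper's general model. Under them a pool of size s tests positive with probability 1 - (1 - p)^s, and the maximum likelihood estimate of the prevalence p has a well-known closed form. The function name and arguments are hypothetical.

```python
def pooled_prevalence_mle(positive_pools, total_pools, pool_size):
    """MLE of prevalence p from equal-sized pools.

    Assumes a perfect test and no dilution effect, so a pool of
    size s is positive with probability 1 - (1 - p)**s.
    (Illustrative helper; not taken from the paper.)
    """
    if total_pools <= 0 or pool_size <= 0:
        raise ValueError("total_pools and pool_size must be positive")
    if not 0 <= positive_pools <= total_pools:
        raise ValueError("positive_pools must lie in [0, total_pools]")
    # The MLE of the pool-level positive probability is the observed fraction.
    positive_fraction = positive_pools / total_pools
    # Invert pi = 1 - (1 - p)**s to recover the unit-level prevalence.
    return 1.0 - (1.0 - positive_fraction) ** (1.0 / pool_size)


if __name__ == "__main__":
    # 19 positive pools out of 100, each pool containing 2 units.
    print(pooled_prevalence_mle(19, 100, 2))
```

With unequal pool sizes, covariates, or an excess intensity parameter, no such closed form should be expected and the likelihood would have to be maximised numerically.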
Abstract
This paper examines the statistical properties of a distributional form that arises from pooled testing for the prevalence of a binary outcome. Our base distribution is a two-parameter distribution using a prevalence and excess intensity parameter; the latter is included to allow for a dilution or intensification effect with larger pools. We also examine a generalised form of the distribution where pools have covariate information that affects the prevalence through a linked linear form. We study the general pooled binomial distribution in its own right and as a special case of broader forms of binomial GLMs using the complementary log-log link function. We examine the information function and show the information content of individual sample items. We demonstrate that pooling reduces information content of sample units and we give simple heuristics for choosing an "optimal" pool size…
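The claim that pooling reduces the information content of sample units can be checked numerically in the simplest case. The sketch below assumes a perfect test and no excess intensity effect, so a pool of size s is positive with probability π = 1 - (1 - p)^s. It is an illustration under those assumptions, not the paper's two-parameter form, and all function names are hypothetical. In this case the Fisher information about p from one pool is s²(1 - p)^(s-2) / (1 - (1 - p)^s), and the complementary log-log link turns the pool size into an additive offset: cloglog(π) = log(s) + cloglog(p).

```python
import math


def pool_positive_prob(p, s):
    """Probability that a pool of s units tests positive (no dilution assumed)."""
    return 1.0 - (1.0 - p) ** s


def cloglog(x):
    """Complementary log-log transform: log(-log(1 - x))."""
    return math.log(-math.log(1.0 - x))


def info_per_pool(p, s):
    """Fisher information about p from one pool outcome of size s."""
    q = 1.0 - p
    return s ** 2 * q ** (s - 2) / (1.0 - q ** s)


def info_per_unit(p, s):
    """Fisher information about p per sampled unit when pooling s units."""
    return info_per_pool(p, s) / s


if __name__ == "__main__":
    p = 0.1
    for s in (1, 2, 5, 10, 20):
        print(s, round(info_per_pool(p, s), 3), round(info_per_unit(p, s), 3))
    # The pool size enters the cloglog scale as an offset log(s).
    print(cloglog(pool_positive_prob(p, 5)), math.log(5) + cloglog(p))
```

In this setting the information per unit falls as the pool size grows, while the information per test can rise. That trade-off is what makes the choice of pool size a question of testing cost against sampling cost.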
Taxonomy
Topics: Statistical Distribution Estimation and Applications · SARS-CoV-2 detection and testing · Statistical Methods and Bayesian Inference
