
TL;DR
This paper examines the subtle deviations from simple binomial models in gender distribution data, highlighting overdispersion, dependence, and family-level variations, and discusses how sample size affects statistical inference.
Contribution
It analyzes overdispersion and dependence in gender data, revealing small but statistically detectable deviations from the independent binomial model.
Findings
Gender ratios are approximately equal but show slight imbalances.
Family-level variations cause overdispersion beyond binomial expectations.
Sample size drives p-values and detection power: deviations this small only become detectable with enough data.
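To make the overdispersion finding concrete, here is a minimal sketch, not the paper's own method. It assumes a hypothetical two-point mixture in which half of all families have boy-probability mu - d and the other half mu + d; the values n = 4, mu = 0.5 and d = 0.05 are illustrative assumptions, not figures from the paper. It compares the variance of the number of boys, and the share of single-gender families, against the plain binomial model.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k boys among n children with a fixed boy-probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_point_mixture_pmf(k, n, mu, d):
    """Hypothetical family-level variation: p is mu - d or mu + d, each with probability 1/2."""
    return 0.5 * binom_pmf(k, n, mu - d) + 0.5 * binom_pmf(k, n, mu + d)

def variance(pmf):
    """Variance of a distribution given as a list of probabilities for k = 0..n."""
    mean = sum(k * q for k, q in enumerate(pmf))
    return sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

# Illustrative values only (assumptions, not data from the paper).
n, mu, d = 4, 0.5, 0.05

plain = [binom_pmf(k, n, mu) for k in range(n + 1)]
mixed = [two_point_mixture_pmf(k, n, mu, d) for k in range(n + 1)]

var_plain = variance(plain)          # binomial: n * mu * (1 - mu)
var_mixed = variance(mixed)          # adds n * (n - 1) * d**2 on top
single_plain = plain[0] + plain[n]   # share of all-girl plus all-boy families
single_mixed = mixed[0] + mixed[n]

print(f"variance: binomial {var_plain:.4f}, mixture {var_mixed:.4f}")
print(f"single-gender share: binomial {single_plain:.4f}, mixture {single_mixed:.4f}")
```

Under these assumptions the mixture has both a larger variance and a larger share of only-girls and only-boys families than the binomial model, even though the overall boy-probability is unchanged.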
Abstract
Take a look around you, in your family, your school or workplace, or in the streets, and you see boys and girls in about equal proportion, with no easily visible gender patterns among siblings. So, to the famous first order of statistical approximation, we are all the results of hierarchical cascades of independent coin tosses through history, with each little fate determined by a fair 0.50-0.50 coin. This is not entirely correct, as one discovers with careful analysis and enough data: the coins of fate are (a little) imbalanced; they vary (a little) from family to family; there is a (slight) dependence in the gender sequence of a family's children; and there are (slightly) more only-girls and only-boys families than the binomial model predicts. In this article I also take the opportunity to discuss how sample sizes influence p-values and statistical detection power.
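The point about sample size and p-values can be illustrated with a short sketch. It assumes a hypothetical observed boy fraction of 0.51 (an illustrative value, not a figure from the paper) and uses a standard two-sided normal-approximation test against a fair coin; the function name `two_sided_p_value` is introduced here for illustration only.

```python
from math import erfc, sqrt

def two_sided_p_value(boys, n, p0=0.5):
    """Two-sided p-value for H0: P(boy) = p0, using the normal approximation."""
    z = (boys / n - p0) / sqrt(p0 * (1 - p0) / n)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))

# Same hypothetical imbalance (51% boys) at three sample sizes.
results = {}
for n in (100, 10_000, 1_000_000):
    boys = round(0.51 * n)
    results[n] = two_sided_p_value(boys, n)
    print(f"n = {n:>9,}: p-value = {results[n]:.3g}")
```

The same one-percentage-point imbalance is indistinguishable from a fair coin at n = 100, borderline at n = 10,000, and overwhelming at n = 1,000,000, which is why small deviations of this kind only show up with enough data.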
