On the Random Sampling of Pairs, with Pedestrian Examples
Richard Arratia, Stephen DeSalvo

TL;DR
This paper compares two methods of sampling a matching pair from a discrete probability distribution on colors, quantifies the discrepancy between the two resulting pair-color distributions, and finds its exact extreme values for two and three colors.
Contribution
It introduces the concept of discrepancy between pair-color distributions and derives exact extremal values for two and three colors, proposing a conjecture for the general case.
Findings
Exact extremal discrepancy for two colors
Exact extremal discrepancy for three colors
A conjecture for the maximum discrepancy in general cases
Abstract
Suppose one desires to randomly sample a pair of objects such as socks, hoping to get a matching pair. Even in the simplest situation for sampling, which is sampling with replacement, the innocent phrase "the distribution of the color of a matching pair" is ambiguous. One interpretation is that we condition on the event of getting a match between two random socks; this corresponds to sampling two at a time, over and over without memory, until a matching pair is found. A second interpretation is to sample sequentially, one at a time, with memory, until the same color has been seen twice. We study the difference between these two methods. The input is a discrete probability distribution on colors, describing what happens when one sock is sampled. There are two derived distributions --- the pair-color distributions under the two methods of getting a match. The output, a number we call…
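The two interpretations described in the abstract can be made concrete with a small calculation. The following is a minimal sketch, not the authors' code: the function names `conditioned_on_match` and `first_repeat` are hypothetical, and the exhaustive enumeration in `first_repeat` is only practical for a handful of colors. It assumes the input `p` is a list of color probabilities summing to 1.

```python
def conditioned_on_match(p):
    """Pair-color distribution when two socks are drawn at a time,
    without memory, until they match: color i has weight p[i]**2."""
    total = sum(x * x for x in p)
    return [x * x / total for x in p]


def first_repeat(p):
    """Pair-color distribution when socks are drawn one at a time,
    with memory, until some color has been seen twice.

    Enumerates every order in which distinct colors can appear
    before the first repeat (illustrative; cost grows like k!).
    """
    k = len(p)
    result = [0.0] * k

    def visit(seen, prob):
        # `seen` is a bitmask of colors drawn exactly once so far.
        for c in range(k):
            if p[c] == 0:
                continue
            if seen & (1 << c):
                # Color c repeats: the matching pair has color c.
                result[c] += prob * p[c]
            else:
                visit(seen | (1 << c), prob * p[c])

    visit(0, 1.0)
    return result


if __name__ == "__main__":
    p = [2 / 3, 1 / 3]
    print(conditioned_on_match(p))
    print(first_repeat(p))
```

For two colors with probabilities 2/3 and 1/3, the first method assigns the more common color probability 4/5, while the second assigns it 20/27, so the two pair-color distributions already differ in the simplest asymmetric case.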
Taxonomy
Topics: Advanced Numerical Analysis Techniques
