The Sample Complexity of Search over Multiple Populations
Matthew L. Malloy, Gongguo Tang, Robert D. Nowak

TL;DR
This paper analyzes the sample complexity of finding one population with distribution P1 among many populations, each following either P0 or P1. It compares adaptive and non-adaptive sampling strategies and provides bounds, with explicit expressions for the Gaussian and Bernoulli cases.
Contribution
It introduces bounds on sample complexity for search over multiple populations and proposes near-optimal adaptive sampling schemes with explicit performance expressions.
Findings
Adaptive sampling schemes approach the lower bound on sample complexity.
Explicit sample-complexity expressions are given for Gaussian and Bernoulli distributions.
Non-adaptive schemes require significantly more samples than adaptive schemes.
Abstract
This paper studies the sample complexity of searching over multiple populations. We consider a large number of populations, each corresponding to either distribution P0 or P1. The goal of the search problem studied here is to find one population corresponding to distribution P1 with as few samples as possible. The main contribution is to quantify the number of samples needed to correctly find one such population. We consider two general approaches: non-adaptive sampling methods, which sample each population a predetermined number of times until a population following P1 is found, and adaptive sampling methods, which employ sequential sampling schemes for each population. We first derive a lower bound on the number of samples required by any sampling scheme. We then consider an adaptive procedure consisting of a series of sequential probability ratio tests, and show it comes within a…
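The adaptive procedure described in the abstract, a series of sequential probability ratio tests (SPRTs), can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the Gaussian case with P0 = N(0, 1) and P1 = N(mu, 1), and the names `sprt_gaussian` and `sequential_search`, the `prior` parameter (the probability that a population follows P1), and the threshold values `lower` and `upper` are all hypothetical choices made for illustration.

```python
import random


def sprt_gaussian(draw, mu, lower, upper):
    """Run one SPRT of P0 = N(0, 1) against P1 = N(mu, 1).

    `draw` returns one sample from the population under test.
    `lower` < 0 < `upper` are log-likelihood-ratio thresholds
    (illustrative assumptions, not values from the paper).
    Returns (declared_p1, samples_used).
    """
    llr = 0.0
    n = 0
    while lower < llr < upper:
        x = draw()
        n += 1
        # log p1(x) / p0(x) for unit-variance Gaussians with means mu and 0
        llr += mu * x - 0.5 * mu * mu
    return llr >= upper, n


def sequential_search(mu, prior, lower, upper, rng, max_populations=1_000_000):
    """Test populations one at a time until an SPRT declares P1.

    Each simulated population follows P1 with probability `prior`
    (a hypothetical simulation parameter). A population whose SPRT
    exits at the lower threshold is discarded and the next one is tested.
    Returns (index, truly_p1, total_samples).
    """
    total = 0
    for index in range(max_populations):
        truly_p1 = rng.random() < prior
        mean = mu if truly_p1 else 0.0
        declared, n = sprt_gaussian(lambda: rng.gauss(mean, 1.0), mu, lower, upper)
        total += n
        if declared:
            return index, truly_p1, total
    raise RuntimeError("no population was declared P1")
```

Calling `sequential_search(1.0, 0.05, -3.0, 12.0, random.Random(0))` returns the index of the population that was declared P1, whether it truly followed P1 in the simulation, and the total number of samples drawn. Under these assumptions, raising `upper` lowers the chance of a false declaration at the cost of more samples on the final population, while `lower` controls how quickly P0 populations are discarded.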
