Asymptotic Seed Bias in Respondent-driven Sampling
Yuling Yan, Bret Hanlon, Sebastien Roch, Karl Rohe

TL;DR
This paper investigates how initial seed selection biases affect the limiting distribution of estimators in respondent-driven sampling, revealing that some estimators are robust while others are significantly influenced by seed bias.
Contribution
It provides a theoretical analysis of seed bias effects on RDS estimators using branching process tools, highlighting conditions under which bias impacts the estimators' distributions.
Findings
The GLS estimator is unaffected by seed bias under certain conditions.
The VH estimator converges to a mixture distribution that depends on the seed node.
Numerical experiments support the theoretical results beyond the Markov assumptions.
Abstract
Respondent-driven sampling (RDS) collects a sample of individuals in a networked population by incentivizing the sampled individuals to refer their contacts into the sample. This iterative process is initialized from some seed node(s). Sometimes, this selection creates a large amount of seed bias. Other times, the seed bias is small. This paper gains a deeper understanding of this bias by characterizing its effect on the limiting distribution of various RDS estimators. Using classical tools and results from multi-type branching processes (Kesten and Stigum, 1966), we show that the seed bias is negligible for the Generalized Least Squares (GLS) estimator and non-negligible for both the inverse probability weighted and Volz-Heckathorn (VH) estimators. In particular, we show that (i) above a critical threshold, VH converges to a non-trivial mixture distribution, where the mixture component…
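For readers unfamiliar with the VH estimator named in the abstract, the following is a minimal sketch of its standard form, in which each sampled value is weighted by the inverse of the respondent's reported degree. The formula is the textbook definition and is not stated on this page; the function name `vh_estimate` and the toy sample are hypothetical and chosen only for illustration.

```python
def vh_estimate(values, degrees):
    """Volz-Heckathorn estimate of a population mean (standard textbook form).

    values  : outcome y_i reported by each sampled individual
    degrees : network degree d_i reported by each sampled individual
    Returns sum(y_i / d_i) / sum(1 / d_i).
    """
    if len(values) != len(degrees):
        raise ValueError("values and degrees must have the same length")
    if len(values) == 0:
        raise ValueError("sample must be non-empty")
    if any(d <= 0 for d in degrees):
        raise ValueError("degrees must be positive")
    numerator = sum(y / d for y, d in zip(values, degrees))
    denominator = sum(1.0 / d for d in degrees)
    return numerator / denominator


# Hypothetical toy sample: a binary trait and self-reported degrees.
sample_values = [1, 0, 1, 0]
sample_degrees = [4, 2, 4, 1]
estimate = vh_estimate(sample_values, sample_degrees)
```

In this toy sample the trait is concentrated among high-degree respondents, so the inverse-degree weighting pulls the estimate below the unweighted sample mean. The sketch does not model seed selection, which is the source of the bias the paper studies.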
Taxonomy
Topics: HIV, Drug Use, Sexual Risk · HIV/AIDS Research and Interventions · Complex Network Analysis Techniques
