Improved Inference for Respondent-Driven Sampling Data with Application to HIV Prevalence Estimation
Krista J. Gile

TL;DR
This paper introduces a successive sampling estimator for respondent-driven sampling data that reduces the bias of population mean estimates. It performs best when the population size is known and remains robust when the population size is uncertain.
Contribution
It presents a novel successive sampling approach that respects the without-replacement nature of respondent-driven sampling, improving estimation accuracy over existing methods.
Findings
The new estimator outperforms existing estimators when population size is known.
Sensitivity analysis shows robustness of the estimator with unknown population sizes.
Application to real data demonstrates practical utility across different populations.
Abstract
Respondent-driven sampling is a form of link-tracing network sampling, which is widely used to study hard-to-reach populations, often to estimate population proportions. Previous treatments of this process have used a with-replacement approximation, which we show induces bias in estimates for large sample fractions and differential network connectedness by characteristic of interest. We present a treatment of respondent-driven sampling as a successive sampling process. Unlike existing representations, our approach respects the essential without-replacement feature of the process, while converging to an existing with-replacement representation for small sample fractions, and to the sample mean for a full-population sample. We present a successive-sampling based estimator for population means based on respondent-driven sampling data, and demonstrate its superior performance when the…
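The two limiting cases described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's estimator: it assumes the degrees of the whole population are known and approximates inclusion probabilities by simulation, which is a simplification made only for illustration. The function names `simulate_inclusion_probs` and `hajek_mean` are hypothetical.

```python
import random


def simulate_inclusion_probs(degrees, n, reps=2000, seed=0):
    """Monte Carlo inclusion probabilities under successive sampling.

    Each replicate draws n units without replacement; every draw picks
    one of the remaining units with probability proportional to degree.
    Assumption: population degrees are fully known (illustration only).
    """
    rng = random.Random(seed)
    counts = [0] * len(degrees)
    for _ in range(reps):
        remaining = list(range(len(degrees)))
        weights = [degrees[i] for i in remaining]
        for _ in range(n):
            k = rng.choices(range(len(remaining)), weights=weights)[0]
            counts[remaining[k]] += 1
            # Remove the sampled unit so it cannot be drawn again.
            remaining.pop(k)
            weights.pop(k)
    return [c / reps for c in counts]


def hajek_mean(values, inclusion_probs):
    """Weighted mean with weights inversely proportional to inclusion probability."""
    num = sum(y / p for y, p in zip(values, inclusion_probs))
    den = sum(1.0 / p for p in inclusion_probs)
    return num / den


if __name__ == "__main__":
    degrees = [1, 2, 3, 4, 5, 6]   # toy population, assumed known
    outcome = [0, 0, 0, 1, 1, 1]   # toy binary characteristic
    for n in (1, 3, 6):
        probs = simulate_inclusion_probs(degrees, n)
        print(n, [round(p, 3) for p in probs])
    # Full-population sample: all probabilities are 1, so the weighted
    # mean reduces to the plain sample mean.
    full = simulate_inclusion_probs(degrees, len(degrees), reps=10)
    print(hajek_mean(outcome, full))
```

With a single draw the inclusion probabilities are proportional to degree, which is the with-replacement weighting. When the sample covers the whole population every probability equals 1 and the weighted mean is the sample mean. Intermediate sample fractions fall between these two cases.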
Taxonomy
Topics: HIV, Drug Use, Sexual Risk · HIV/AIDS Research and Interventions · Data-Driven Disease Surveillance
