New Survey Questions and Estimators for Network Clustering with Respondent-Driven Sampling Data
Ashton M. Verdery, Jacob C. Fisher, Nalyn Siripong, Kahina Abdesselam, Shawn Bauldry

TL;DR
This paper introduces new survey questions and estimators for assessing network clustering in respondent-driven sampling data, enabling better understanding of social network structures in hard-to-survey populations.
Contribution
It develops novel data collection instruments and estimators for network clustering in RDS, extending RDS's utility beyond prevalence estimation to network analysis.
Findings
Clustering coefficient estimators perform well in RDS samples.
The estimators are robust to multiple seeds, sampling without replacement, and imperfect response rates.
The approach broadens RDS applications to social network analysis.
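
As a point of reference for what these estimators target, the sketch below computes the standard local clustering coefficient (the share of a node's neighbor pairs that are themselves connected) and its network average on a fully observed toy graph. This is only an illustration of the quantity being estimated, not the paper's RDS estimator, which works from sampled data. The function names and the toy adjacency data are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbors that are directly connected.

    `adj` maps each node to the set of its neighbors (undirected graph).
    Nodes with fewer than two neighbors are assigned 0.0 by convention.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Count the ties that exist among the node's neighbors.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    # Divide by the number of possible neighbor pairs, k * (k - 1) / 2.
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical toy network: a triangle (a, b, c) plus a pendant node d.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

if __name__ == "__main__":
    for n in sorted(toy):
        print(n, local_clustering(toy, n))
    print("average:", average_clustering(toy))
```

In this toy graph, node a has three neighbor pairs of which one is connected, b and c each have a single connected pair, and d has too few neighbors to form a pair. In an RDS setting the full adjacency structure is not observed, which is why dedicated survey questions and estimators are needed.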
Abstract
Respondent-driven sampling (RDS) is a popular method for sampling hard-to-survey populations that leverages social network connections through peer recruitment. While RDS is most frequently applied to estimate the prevalence of infections and risk behaviors of interest to public health, like HIV/AIDS or condom use, it is rarely used to draw inferences about the structural properties of social networks among such populations because it does not typically collect the necessary data. Drawing on recent advances in computer science, we introduce a set of data collection instruments and RDS estimators for network clustering, an important topological property that has been linked to a network's potential for diffusion of information, disease, and health behaviors. We use simulations to explore how these estimators, originally developed for random walk samples of computer networks, perform when…
Taxonomy
Topics: HIV, Drug Use, Sexual Risk · Complex Network Analysis Techniques · Data-Driven Disease Surveillance
