Big data, big problems: Responding to "Are we there yet?"
Alex Reinhart, Ryan Tibshirani

TL;DR
This paper critiques the data defect correlation framework for evaluating survey bias, demonstrates its limitations, and argues for a more comprehensive approach to assessing survey quality, particularly for COVID-19 survey data.
Contribution
It identifies key limitations of the data defect correlation framework and proposes adopting the Total Survey Error framework for a more complete assessment of survey quality.
Findings
CTIS performs well for its intended goals.
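The data defect correlation framework applies only to population point estimation and does not allow for measurement error.
A broader framework, such as Total Survey Error, gives a more complete analysis of survey bias.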
Abstract
Bradley et al. (arXiv:2106.05818v2), as part of an analysis of the performance of large-but-biased surveys during the COVID-19 pandemic, argue that the data defect correlation provides a useful tool to quantify the effects of sampling bias on survey results. We examine their analyses of results from the COVID-19 Trends and Impact Survey (CTIS) and show that, despite their claims, CTIS in fact performs well for its intended goals. Our examination reveals several limitations in the data defect correlation framework, including that it is only applicable for a single goal (population point estimation) and that it does not admit the possibility of measurement error. Through examples, we show that these limitations seriously affect the applicability of the framework for analyzing CTIS results. Through our own alternative analyses, we arrive at different conclusions, and we argue for a more…
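To make the quantity under discussion concrete, the sketch below checks numerically the identity usually attributed to Meng (2018), on which the data defect correlation is based: the error of a sample mean equals rho * sqrt((N - n) / n) * sigma_Y, where rho is the finite-population correlation between the outcome and the response indicator. The population, the response probabilities, and the function names are hypothetical choices made for illustration; nothing here uses CTIS data or reproduces either paper's analysis. The identity concerns only the error of a point estimate of a population mean and assumes the outcome is recorded without error, which is the scope limitation the abstract describes.

```python
import math
import random


def data_defect_correlation(y, r):
    """Finite-population correlation between outcome y and response indicator r."""
    N = len(y)
    mean_y = sum(y) / N
    mean_r = sum(r) / N
    cov = sum((yi - mean_y) * (ri - mean_r) for yi, ri in zip(y, r)) / N
    sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / N)
    sd_r = math.sqrt(sum((ri - mean_r) ** 2 for ri in r) / N)
    return cov / (sd_y * sd_r)


def error_decomposition(y, r):
    """Return (actual error, error implied by the identity, rho)."""
    N = len(y)
    n = sum(r)
    pop_mean = sum(y) / N
    sample_mean = sum(yi for yi, ri in zip(y, r) if ri) / n
    # Population standard deviation of y (divisor N, not N - 1).
    sd_y = math.sqrt(sum((yi - pop_mean) ** 2 for yi in y) / N)
    rho = data_defect_correlation(y, r)
    actual = sample_mean - pop_mean
    implied = rho * math.sqrt((N - n) / n) * sd_y
    return actual, implied, rho


# Hypothetical population: a binary outcome held by about 60% of N people.
rng = random.Random(0)
N = 10_000
y = [1 if rng.random() < 0.6 else 0 for _ in range(N)]
# Hypothetical biased response: people with y = 1 respond more often.
r = [1 if rng.random() < (0.08 if yi else 0.05) else 0 for yi in y]

actual, implied, rho = error_decomposition(y, r)
print(f"rho = {rho:.4f}, actual error = {actual:.4f}, implied error = {implied:.4f}")
```

The two error values agree up to floating-point rounding because the identity is algebraic, not approximate. It says nothing about other survey goals such as tracking changes over time or comparing regions, and nothing about discrepancies between what respondents report and the truth.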
Taxonomy
Topics: COVID-19 epidemiological studies · Data-Driven Disease Surveillance · Survey Methodology and Nonresponse
