Tailoring Capture-Recapture Methods to Estimate Registry-Based Case Counts Based on Error-Prone Diagnostic Signals
Lin Ge, Yuzi Zhang, Kevin C. Ward, Timothy L. Lash, Lance A. Waller, and Robert H. Lyles

TL;DR
This paper introduces an improved capture-recapture method that uses a small, validated sample and existing signaling data streams to accurately estimate disease recurrence case counts, accounting for diagnostic errors.
Contribution
The paper extends the anchor stream sampling design to handle false positive and false negative signals, enabling valid case count estimation with minimal validation effort.
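To make the motivation concrete, the following minimal Python sketch simulates a registry and compares two case counts: the raw count of positive signals from an error-prone stream, and a design-based estimate scaled up from a small random sample whose status is validated. It is an illustration under stated assumptions, not the estimator developed in the paper. The function name `simulate`, the error rates, and the sample fraction are all hypothetical, and the signal is applied to every registry member for simplicity, whereas the paper allows streams that cover non-representative subsets.

```python
import random


def simulate(n_total=10000, prevalence=0.10, sample_frac=0.05,
             sensitivity=0.80, false_pos_rate=0.05, seed=0):
    """Toy comparison of a naive signal count and a validated-sample estimate.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # True recurrence status for each registry member (unknown in practice).
    truth = [rng.random() < prevalence for _ in range(n_total)]

    # Error-prone signal: true cases are missed with probability
    # 1 - sensitivity, and non-cases are flagged with probability false_pos_rate.
    signal = [
        (rng.random() < sensitivity) if is_case else (rng.random() < false_pos_rate)
        for is_case in truth
    ]

    # "Anchor" sample: a simple random sample whose true status is validated
    # (for example, through medical records abstraction).
    n_sample = int(n_total * sample_frac)
    sample_idx = rng.sample(range(n_total), n_sample)
    validated_cases = sum(truth[i] for i in sample_idx)

    return {
        "true_count": sum(truth),
        # Counting positive signals ignores both kinds of diagnostic error.
        "naive_signal_count": sum(signal),
        # Scale the validated sample proportion up to the full registry.
        "anchor_only_estimate": n_total * validated_cases / n_sample,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value}")
```

With these hypothetical error rates, the naive signal count is biased upward in expectation because false positives among the many non-cases outnumber the missed true cases. The validated-sample estimate is unbiased but noisy when the sample is small. The paper's contribution is to combine the two sources so that the signal data improve precision without carrying their errors into the estimate.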
Findings
The method provides accurate case count estimates with valid confidence intervals.
Simulation studies demonstrate improved efficiency over traditional capture-recapture methods.
Application to Georgia Cancer Registry data illustrates practical utility.
Abstract
Surveillance research is of great importance for effective and efficient epidemiological monitoring of case counts and disease prevalence. Taking specific motivation from ongoing efforts to identify recurrent cases based on the Georgia Cancer Registry, we extend the recently proposed "anchor stream" sampling design and estimation methodology. Our approach offers a more efficient and defensible alternative to traditional capture-recapture (CRC) methods by leveraging a relatively small random sample of participants whose recurrence status is obtained through a principled application of medical records abstraction. This sample is combined with one or more existing signaling data streams, which may yield data based on arbitrarily non-representative subsets of the full registry population. The key extension developed here accounts for the common problem of false positive or negative diagnostic…
Taxonomy
Topics: Census and Population Estimation · Data-Driven Disease Surveillance · Statistical Methods and Bayesian Inference
