# Evaluating a Targeted Minimum Loss-Based Estimator for Capture-Recapture Analysis: An Application to HIV Surveillance in San Francisco, California

**Authors:** Paul Wesson, Manjari Das, Mia Chen, Ling Hsu, Willi McFarland, Edward Kennedy, Nicholas P Jewell

PMC · DOI: 10.1093/aje/kwad231 · American Journal of Epidemiology · 2023-11-17

## TL;DR

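This paper evaluates a targeted minimum loss-based estimation (TMLE) model for capture-recapture analysis, which estimates the size of hidden populations from incomplete, overlapping lists. Applied to HIV surveillance data from San Francisco, the TMLE model produced the most accurate and precise estimate of all models examined.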

## Contribution

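The TMLE model developed for capture-recapture improves on conventional log-linear modeling in three ways: it targets the parameter of interest, flexibly fits the data to alternative functional forms, and limits bias from small cell sizes.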

## Key findings

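- Based on 2,584 people observed on 3 lists, the TMLE model estimated 13,523 San Francisco residents living with HIV as of December 31, 2019 (95% confidence interval: 12,222, 14,824), compared with a "ground truth" of 12,507.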
- TMLE outperformed other models in accuracy and precision using real-world HIV surveillance data.
- Simulations confirmed TMLE's reliability in estimating hidden population sizes.
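
As a quick arithmetic check on the reported figures, the Python sketch below computes how far the TMLE estimate sits from the "ground truth" and whether the confidence interval covers it. The numbers come from the abstract; the variable names are illustrative, and the implied standard error assumes a symmetric Wald-type interval (estimate ± 1.96 × SE), which the summary does not state explicitly.

```python
# Figures reported in the abstract.
estimate = 13_523
ci_low, ci_high = 12_222, 14_824
ground_truth = 12_507

# Relative difference between the estimate and the "ground truth".
relative_error = (estimate - ground_truth) / ground_truth

# Does the 95% confidence interval contain the "ground truth"?
covers_truth = ci_low <= ground_truth <= ci_high

# The interval is symmetric around the estimate, so a half-width is well defined.
is_symmetric = (estimate - ci_low) == (ci_high - estimate)
half_width = (ci_high - ci_low) / 2

# Assumption: a Wald-type interval, so SE is roughly half-width / 1.96.
implied_se = half_width / 1.96

print(f"relative error: {relative_error:.1%}")
print(f"CI covers ground truth: {covers_truth}")
print(f"half-width: {half_width:.0f}, implied SE: {implied_se:.0f}")
```

Under these assumptions the estimate is roughly 8% above the "ground truth", and the interval, with a half-width of 1,301, contains it.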

## Abstract

The capture-recapture method is a common tool used in epidemiology to estimate the size of “hidden” populations and correct the underascertainment of cases, based on incomplete and overlapping lists of the target population. Log-linear models are often used to estimate the population size yet may produce implausible and unreliable estimates due to model misspecification and small cell sizes. A novel targeted minimum loss-based estimation (TMLE) model developed for capture-recapture makes several notable improvements to conventional modeling: “targeting” the parameter of interest, flexibly fitting the data to alternative functional forms, and limiting bias from small cell sizes. Using simulations and empirical data from the San Francisco, California, Department of Public Health’s human immunodeficiency virus (HIV) surveillance registry, we evaluated the performance of the TMLE model and compared results with those of other common models. Based on 2,584 people observed on 3 lists reportable to the surveillance registry, the TMLE model estimated the number of San Francisco residents living with HIV as of December 31, 2019, to be 13,523 (95% confidence interval: 12,222, 14,824). This estimate, compared with a “ground truth” of 12,507, was the most accurate and precise of all models examined. The TMLE model is a significant advancement in capture-recapture studies, leveraging modern statistical methods to improve estimation of the sizes of hidden populations.
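
To make the capture-recapture idea concrete, here is a minimal sketch of the classic two-list Lincoln-Petersen estimator and Chapman's bias-corrected variant. This is not the TMLE model evaluated in the paper, which uses 3 lists and flexible model fitting. The function names and counts are hypothetical, and both estimators assume that the two lists are independent and that the population is closed.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Two-list population size estimate: N = n1 * n2 / m.

    n1, n2: number of people on list 1 and list 2
    m:      number of people appearing on both lists
    """
    if m <= 0:
        raise ValueError("the estimator is undefined when the lists do not overlap")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's variant, which reduces small-sample bias and tolerates m = 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts, chosen only for illustration (not from the paper).
n1, n2, m = 200, 150, 30

lp_estimate = lincoln_petersen(n1, n2, m)
chapman_estimate = chapman(n1, n2, m)

print(f"Lincoln-Petersen: {lp_estimate:.0f}")
print(f"Chapman:          {chapman_estimate:.1f}")
```

With these hypothetical counts, the Lincoln-Petersen estimate is 1,000: 30 of the 150 people on list 2 also appear on list 1, so list 1 is taken to cover about one fifth of the population. Small overlap counts make such estimates unstable, which is the small-cell problem the abstract describes for conventional models.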

## Full-text entities

- **Diseases:** HIV (MESH:D015658)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10999650/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10999650/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC10999650/full.md

---
Source: https://tomesphere.com/paper/PMC10999650