Efficient Targeted Maximum Likelihood Estimators for Two-Phase Design Problems
Sky Qiu, Susan Gruber, Pamela A. Shaw, Brian D. Williamson, Mark J. van der Laan

TL;DR
This paper introduces a new class of efficient targeted maximum likelihood estimators for two-phase sampling designs, improving upon existing methods by leveraging the TMLE framework to handle coarsened data structures.
Contribution
The paper develops a novel class of TMLE-based estimators specifically designed for two-phase sampling problems, enhancing efficiency over existing approaches.
Findings
The new estimators are constructed within the TMLE framework and are asymptotically equivalent.
The proposed methods outperform existing estimators in efficiency.
The approach assumes coarsening at random: the phase-2 sampling mechanism depends only on fully observed variables.
Abstract
In a typical two-phase design, a random sample is drawn from the target population in phase 1, during which only a subset of variables is collected. In phase 2, a subsample of the phase-1 cohort is selected, and additional variables are measured. This setting induces a coarsened data structure on the data from the second phase. We assume coarsening at random, that is, the phase-2 sampling mechanism depends only on variables fully observed. We review existing estimators, including the generalized raking estimator and the inverse probability of censoring weighted targeted maximum likelihood estimation (IPCW-TMLE) along with its extensions that also target the phase-2 sampling mechanism to improve efficiency. We further introduce a new class of estimators constructed within the TMLE framework that are asymptotically equivalent.
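To make the two-phase setup concrete, the following is a minimal simulation sketch of coarsening at random. It does not implement the paper's TMLE or raking estimators; it uses a plain inverse-probability-weighted (Hájek-style) mean as a stand-in, and the data-generating model, the variable names (`w`, `y`, `pi`, `delta`), and the assumption that the phase-2 sampling probabilities are known are all illustrative choices rather than anything taken from the paper.

```python
import numpy as np

# Illustrative simulation only: the model and all names are assumptions.
rng = np.random.default_rng(0)
n = 5000

# Phase 1: W is measured on everyone in the random sample.
w = rng.normal(size=n)

# Y is the "expensive" variable, measured only in phase 2.
# Its true population mean is 2.0 under this assumed model.
y = 2.0 + 1.5 * w + rng.normal(size=n)

# Coarsening at random: the phase-2 selection probability depends
# only on the fully observed phase-1 variable W (assumed known here).
pi = 1.0 / (1.0 + np.exp(-(-0.5 + w)))
delta = rng.binomial(1, pi)  # 1 = selected into phase 2

# Naive complete-case mean ignores the sampling mechanism.
naive = y[delta == 1].mean()

# Inverse-probability-weighted mean (Hajek form): weight each
# phase-2 observation by 1 / pi and normalize by the weight total.
weights = delta / pi
ipw = np.sum(weights * np.where(delta == 1, y, 0.0)) / np.sum(weights)

print(f"naive complete-case mean: {naive:.3f}")
print(f"IPW mean:                 {ipw:.3f}")
```

Because selection favors larger values of `w`, and `y` increases with `w`, the complete-case mean is biased upward in this setup, while the weighted mean recovers the target. The estimators reviewed in the abstract build on this kind of weighting to gain efficiency.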
Taxonomy
Topics: Survey Sampling and Estimation Techniques · Statistical Distribution Estimation and Applications · Statistical Methods and Bayesian Inference
