Generalized raking and stabilized weights for regression modeling in two-phase samples
Tong Chen, Joshua Slone, Gustavo Amorim, Pamela A. Shaw, Bryan E. Shepherd, Thomas Lumley

TL;DR
This paper combines stabilized weights with generalized raking to improve the efficiency of regression modeling in two-phase samples, and evaluates the approach in simulation studies and an application to a large HIV study.
Contribution
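The paper proposes combining the stabilized weight estimator with generalized raking in two-phase sampling. The combination reduces unnecessary weight variation, uses information from auxiliary variables, and can be implemented with a standard statistical package that handles two-phase samples and generalized raking.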
Findings
Simulation studies show that the proposed estimator improves precision under realistic two-phase designs.
Efficiency gains depend on the design and are limited when the sampling design is highly informative.
Application to a large HIV study demonstrates practical utility.
Abstract
In regression models fitted to data from complex survey designs, sampling weights often incorporate non-essential variation, inflating variance estimates. Stabilized weights mitigate this issue by adjusting sampling weights to account for variation explained by covariates. In the context of two-phase sampling, we evaluate the performance of optimal stabilized weights and propose combining the stabilized weight estimator with generalized raking, a class of efficient design-based estimators. This combination improves efficiency by reducing unnecessary weight variation and leveraging information from auxiliary variables. We show this combination can be implemented using the standard statistical package that handles two-phase samples and generalized raking. Simulation studies demonstrate that the proposed estimator enhances precision under realistic two-phase designs, though efficiency…
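The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator, nor the statistical package the abstract refers to. It assumes that a stabilized weight is the sampling weight divided by the mean weight among units sharing the same covariate value, and that raking uses an exponential adjustment so that weighted phase-two totals of the auxiliary variables match their known phase-one totals. The function names `stabilize` and `rake` and the toy data are hypothetical.

```python
import numpy as np

def stabilize(weights, groups):
    """Divide each weight by the mean weight in its covariate group
    (an empirical stand-in for conditioning the weight on covariates)."""
    weights = np.asarray(weights, dtype=float)
    groups = np.asarray(groups)
    out = np.empty_like(weights)
    for g in np.unique(groups):
        mask = groups == g
        out[mask] = weights[mask] / weights[mask].mean()
    return out

def rake(weights, aux, totals, n_iter=50, tol=1e-10):
    """Find adjusted weights w_i * exp(aux_i @ lam) whose weighted
    auxiliary totals equal `totals`, using Newton's method."""
    weights = np.asarray(weights, dtype=float)
    aux = np.asarray(aux, dtype=float)
    totals = np.asarray(totals, dtype=float)
    lam = np.zeros(aux.shape[1])
    scale = max(1.0, float(np.max(np.abs(totals))))
    for _ in range(n_iter):
        w = weights * np.exp(aux @ lam)
        resid = totals - aux.T @ w          # calibration equations
        if np.max(np.abs(resid)) <= tol * scale:
            break
        jac = aux.T @ (aux * w[:, None])    # Jacobian of the weighted totals
        lam = lam + np.linalg.solve(jac, resid)
    return weights * np.exp(aux @ lam)

# Stabilizing: weights vary within covariate group 0 but not within group 1.
stab = stabilize([2, 2, 6, 6, 10, 10], [0, 0, 0, 0, 1, 1])

# Raking on a toy two-phase design (all values are illustrative).
N = 1000
stratum = (np.arange(N) % 4 == 0).astype(int)      # 250 units in stratum 1
x = np.linspace(-2.0, 2.0, N) + stratum            # auxiliary, known in phase one
aux_all = np.column_stack([np.ones(N), x])
totals = aux_all.sum(axis=0)                       # phase-one totals

idx0 = np.flatnonzero(stratum == 0)[::5]           # every 5th unit, weight 5
idx1 = np.flatnonzero(stratum == 1)[::2]           # every 2nd unit, weight 2
idx = np.concatenate([idx0, idx1])
design_w = np.where(stratum[idx] == 1, 2.0, 5.0)   # inverse sampling fractions

aux2 = aux_all[idx]
cal = rake(design_w, aux2, totals)
print(aux2.T @ design_w, aux2.T @ cal, totals)
```

In group 1 the weights are fully explained by the covariate, so the stabilized weights equal one. In group 0 the remaining variation is kept. The raked weights reproduce the phase-one totals exactly, which is the property that lets auxiliary information improve precision.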
