Optimal-Design Domain-Adaptation for Exposure Prediction in Two-Stage Epidemiological Studies
Ron Sarafian, Itai Kloog, Jonathan D. Rosenblatt

TL;DR
This paper introduces a novel method combining optimal design and domain adaptation to improve exposure effect estimation in two-stage epidemiological studies, reducing bias and increasing accuracy.
Contribution
It proposes a new estimator that leverages importance weighting and domain adaptation to enhance imputation and effect estimation in two-stage studies.
Findings
More accurate exposure effect estimates than current methods
Smaller PM effect estimates on hyperglycemia risk with tighter confidence intervals
Effective sharing of information improves health effect estimation
Abstract
In the first stage of a two-stage study, the researcher uses a statistical model to impute the unobserved exposures. In the second stage, imputed exposures serve as covariates in epidemiological models. Imputation errors in the first stage operate as measurement errors in the second stage, and thus bias exposure effect estimates. This study aims to improve the estimation of exposure effects by sharing information between the first and second stages. At the heart of our estimator is the observation that not all second-stage observations are equally important to impute. We thus borrow ideas from optimal experimental design theory to identify individuals of higher importance. We then improve the imputation of these individuals using ideas from the machine-learning literature on domain adaptation. Our simulations confirm that the exposure effect estimates are more accurate than the…
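The idea of giving some second-stage individuals more weight during imputation can be illustrated with a minimal sketch of importance-weighted least squares. This is not the paper's estimator: the toy data, the misspecified linear first-stage model, the `importance` scores, and the helper names `weighted_least_squares` and `mse_on_important` are all assumptions made for illustration. In the paper the importance would come from optimal-design considerations, which are not reproduced here.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Return argmin_b sum_i w[i] * (y[i] - X[i] @ b)**2."""
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns the weighted problem into ordinary least squares.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

# Toy first-stage data (assumption): one predictor, a nonlinear true exposure,
# and a deliberately misspecified linear imputation model.
x = np.linspace(0.0, 1.0, 201)
exposure = x ** 2
X = np.column_stack([np.ones_like(x), x])

# Hypothetical importance scores: pretend the individuals with x > 0.8
# matter most for the second-stage effect estimate.
important = x > 0.8
importance = np.where(important, 20.0, 1.0)

coef_plain = weighted_least_squares(X, exposure, np.ones_like(x))
coef_weighted = weighted_least_squares(X, exposure, importance)

def mse_on_important(coef):
    """Mean squared imputation error restricted to the important individuals."""
    resid = exposure[important] - X[important] @ coef
    return float(np.mean(resid ** 2))

mse_plain = mse_on_important(coef_plain)
mse_weighted = mse_on_important(coef_weighted)
print(mse_plain, mse_weighted)
```

In this sketch the weighted fit trades accuracy on the low-importance individuals for accuracy on the high-importance ones, which is the trade-off the abstract describes. How the importance scores are actually derived is left open here.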
Taxonomy
Topics: Machine Learning and Data Classification
