Doubly Robust Regression Analysis for Data Fusion
Katherine Evans, BaoLuo Sun, James Robins, Eric J. Tchetgen Tchetgen

TL;DR
This paper introduces doubly robust semiparametric estimators for regression analysis on data fused from two sources that record different variables, a setting in which no subject has complete data.
Contribution
It develops a class of doubly robust estimators tailored for data fusion scenarios, extending regression analysis methods to handle extreme missing data configurations.
Findings
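Estimators are consistent and asymptotically normal when the regression model of interest is correctly specified together with either a model for the data source process (under an ignorability assumption) or a model for the distribution of the unobserved covariates.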
Simulation studies demonstrate the estimators' robustness and efficiency.
Application to U.S. household data reveals relationships between net asset value and expenditure.
Abstract
This paper investigates the problem of making inference about a parametric model for the regression of an outcome variable $Y$ on covariates $(V, L)$ when data are fused from two separate sources, one which contains information only on $(V, Y)$ while the other contains information only on covariates. This data fusion setting may be viewed as an extreme form of missing data in which the probability of observing complete data $(V, L, Y)$ on any given subject is zero. We have developed a large class of semiparametric estimators, which includes doubly robust estimators, of the regression coefficients in fused data. The proposed method is doubly robust in that it is consistent and asymptotically normal if, in addition to the model of interest, we correctly specify a model for either the data source process under an ignorability assumption, or the distribution of unobserved covariates. We evaluate the…
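To make the data structure concrete, the following is a minimal sketch of what a fused data set might look like. It is not the paper's estimator or simulation design. The labels are used here as follows: V is a covariate recorded in both sources, L is a covariate recorded only in the covariate source, Y is the outcome, and R is a source indicator. The function name, the coefficients, and the 50/50 completely random source assignment are all illustrative assumptions.

```python
import random


def simulate_fused_data(n=1000, seed=0):
    """Hypothetical generator: source A records (V, Y), source B records (V, L)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)                       # covariate seen in both sources
        l = 0.5 * v + rng.gauss(0.0, 1.0)             # covariate seen only in source B
        y = 1.0 + 2.0 * v - l + rng.gauss(0.0, 1.0)   # outcome, seen only in source A
        r = 1 if rng.random() < 0.5 else 0            # assumed random source assignment
        if r == 1:
            rows.append({"R": 1, "V": v, "L": None, "Y": y})   # L is never observed
        else:
            rows.append({"R": 0, "V": v, "L": l, "Y": None})   # Y is never observed
    return rows


rows = simulate_fused_data()
# Count subjects with both L and Y observed; by construction there are none.
n_complete = sum(1 for row in rows if row["L"] is not None and row["Y"] is not None)
print(n_complete)  # 0
```

Because no row contains both L and Y, an ordinary complete-case regression of Y on (V, L) cannot be fit at all, which is why the setting calls for additional modeling of either the source process or the distribution of the unobserved covariates.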
