Data fusion for predicting long-term program impacts
Michael W. Robbins, Sebastian Bauhoff, Lane Burgette

TL;DR
This paper introduces a data fusion approach that combines short-term surrogate outcomes with auxiliary long-term data to predict the long-term impacts of interventions before final outcomes are available.
Contribution
It presents a novel method for imputing missing long-term outcomes using data fusion and surrogate outcomes, validated through simulations and a real case study.
Findings
The method accurately predicts long-term impacts in simulations.
Applied to health insurance data, it estimates statistically significant mortality improvements.
It demonstrates the method's potential to support policy decisions before long-term data are collected.
Abstract
Policymakers often require information on programs' long-term impacts that is not available when decisions are made. We demonstrate how data fusion methods may be used to address the problem of missing final outcomes and predict long-run impacts of interventions before the requisite data are available. We implement this method by concatenating data on an intervention with auxiliary long-term data and then imputing missing long-term outcomes using short-term surrogate outcomes while approximating uncertainty with replication methods. We use simulations to examine the performance of the methodology and apply the method in a case study. Specifically, we fuse data on the Oregon Health Insurance Experiment with data from the National Longitudinal Mortality Study and estimate that being eligible to apply for subsidized health insurance will lead to a statistically significant improvement in…
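The workflow described in the abstract (fuse the two data sources, impute the missing long-term outcome from a short-term surrogate, and approximate uncertainty by replication) can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation: the linear regression imputation and the nonparametric bootstrap are simple stand-ins chosen for illustration, and all function names, variable names, and parameter values (`simulate`, `fused_effect`, `bootstrap_ci`, the slope of 2.0) are hypothetical. The sketch also assumes that the surrogate-to-outcome relationship is the same in both data sources.

```python
import numpy as np


def simulate(n_exp=2000, n_aux=2000, seed=0):
    """Simulate an experiment without long-term outcomes and an auxiliary sample with them."""
    rng = np.random.default_rng(seed)
    # Experiment: treatment t and short-term surrogate s_exp are observed; long-term y is missing.
    t = rng.integers(0, 2, size=n_exp)
    s_exp = 1.0 * t + rng.normal(size=n_exp)
    # Auxiliary data: surrogate and long-term outcome observed, no treatment assignment.
    s_aux = rng.normal(loc=0.5, size=n_aux)
    y_aux = 2.0 * s_aux + rng.normal(size=n_aux)
    return t, s_exp, s_aux, y_aux


def fused_effect(t, s_exp, s_aux, y_aux):
    """Impute long-term outcomes in the experiment and return the treated-minus-control mean."""
    # Fit y ~ s on the auxiliary sample (illustrative imputation model).
    design = np.column_stack([np.ones_like(s_aux), s_aux])
    beta, *_ = np.linalg.lstsq(design, y_aux, rcond=None)
    # Predict the missing long-term outcome for every experimental unit.
    y_imputed = beta[0] + beta[1] * s_exp
    return y_imputed[t == 1].mean() - y_imputed[t == 0].mean()


def bootstrap_ci(t, s_exp, s_aux, y_aux, n_boot=500, seed=1):
    """Percentile interval from resampling both data sources independently."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        i = rng.integers(0, len(t), size=len(t))          # resample experiment rows
        j = rng.integers(0, len(s_aux), size=len(s_aux))  # resample auxiliary rows
        draws[b] = fused_effect(t[i], s_exp[i], s_aux[j], y_aux[j])
    return np.percentile(draws, [2.5, 97.5])


t, s_exp, s_aux, y_aux = simulate()
estimate = fused_effect(t, s_exp, s_aux, y_aux)
lower, upper = bootstrap_ci(t, s_exp, s_aux, y_aux)
print(f"predicted long-term effect: {estimate:.2f} (95% interval {lower:.2f} to {upper:.2f})")
```

In this simulation the treatment shifts the surrogate by 1.0 and the long-term outcome rises by 2.0 per unit of surrogate, so the implied long-term effect is 2.0. Resampling both sources in each replicate lets the interval reflect uncertainty from the imputation model as well as from the experiment.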
Taxonomy
Topics: Healthcare Policy and Management · Advanced Causal Inference Techniques · Health Systems, Economic Evaluations, Quality of Life
