Model Selection for Causal Modeling in Missing Exposure Problems

Yuliang Shi; Yeying Zhu; Joel A. Dubin

arXiv:2406.12171 · stat.ME · December 16, 2024

TL;DR

This paper investigates model selection strategies for causal inference with missing exposure data, proposing a new criterion called 'rank score' to optimize model choice and improve causal effect estimation accuracy.

Contribution

It introduces a novel 'rank score' criterion for jointly evaluating the imputation and propensity score (PS) models in causal inference when the exposure is missing at random (MAR).

Findings

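01. In the simulation studies, the full imputation model combined with outcome-related PS models gives the smallest RMSE of the estimated causal effect.

02. The rank score helps select the best imputation and PS models.

03. An application study demonstrates causal effect estimation for COVID-19 mortality.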

Abstract

In causal inference, properly selecting the propensity score (PS) model is an important topic that has been widely investigated in observational studies. There is also a large literature on the missing data problem. However, very few studies have investigated model selection for causal inference when the exposure is missing at random (MAR). In this paper, we discuss how to select both the imputation and PS models so as to obtain the smallest root mean squared error (RMSE) of the estimated causal effect in our simulation study. We then propose a new criterion, called the "rank score", for evaluating the overall performance of both models. The simulation studies show that the full imputation model combined with outcome-related PS models leads to the smallest RMSE, and that the rank score can help select the best models. An application study is conducted to quantify the causal effect…
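The abstract compares candidate model pairs by the RMSE of the estimated causal effect across simulation replicates. As a minimal sketch of that comparison only, the snippet below computes the RMSE for each candidate pair and picks the smallest. The function names, the pair labels, and the numbers are hypothetical placeholders, not the authors' code or results. The rank score itself is not defined in this excerpt, so it is not implemented here.

```python
import math
from statistics import fmean


def rmse(estimates, truth):
    """Root mean squared error of replicate estimates around the true effect."""
    return math.sqrt(fmean([(e - truth) ** 2 for e in estimates]))


def select_min_rmse(candidates, truth):
    """Return the key of the candidate whose estimates have the smallest RMSE."""
    return min(candidates, key=lambda label: rmse(candidates[label], truth))


# Toy placeholder values for illustration; NOT results from the paper.
# Each key is a hypothetical (imputation model, PS model) pair, and each
# value holds the causal effect estimates from a few simulation replicates.
true_effect = 1.0
candidates = {
    ("imputation_A", "ps_A"): [0.9, 1.1, 1.0, 1.0],
    ("imputation_B", "ps_B"): [1.4, 1.6, 1.5, 1.5],
}

best = select_min_rmse(candidates, true_effect)
print(best, round(rmse(candidates[best], true_effect), 4))
```

Because the squared RMSE equals the squared bias plus the variance of the estimates, a model pair can lose on this criterion either by being systematically off or by being unstable across replicates.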

Taxonomy

Topics: Fault Detection and Control Systems · Bayesian Modeling and Causal Inference