# High-dimensional confounding adjustment using continuous spike and slab priors

**Authors:** Joseph Antonelli, Giovanni Parmigiani, Francesca Dominici

arXiv: 1704.07532 · 2018-10-18

## TL;DR

This paper introduces a Bayesian method that uses continuous spike and slab priors for confounding adjustment in high-dimensional observational studies, targeting causal effect estimation when the number of potential confounders exceeds the sample size.

## Contribution

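It proposes continuous spike and slab priors on the coefficients of the potential confounders. The prior avoids heavily shrinking the coefficients of variables that are strongly associated with the treatment but weakly associated with the outcome, while still shrinking the coefficients of instrumental variables towards zero. The approach is compared against several state-of-the-art methods in high-dimensional settings.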
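To make the idea of a continuous spike and slab prior concrete, here is a minimal sketch of one generic form: a mixture of a narrow Laplace "spike" and a wide Laplace "slab". This is an illustration only and not necessarily the paper's exact specification. The names `slab_weight`, `spike_rate`, and `slab_rate`, and their numeric values, are hypothetical choices made for this example.

```python
import math


def laplace_pdf(beta, rate):
    """Density of a zero-centred Laplace distribution with the given rate."""
    return 0.5 * rate * math.exp(-rate * abs(beta))


def spike_slab_pdf(beta, slab_weight, spike_rate=20.0, slab_rate=0.5):
    """Generic continuous spike and slab density (illustrative assumption).

    The spike (large rate) concentrates mass near zero; the slab (small rate)
    is diffuse. slab_weight is the prior probability of the slab component.
    """
    spike = laplace_pdf(beta, spike_rate)
    slab = laplace_pdf(beta, slab_rate)
    return (1.0 - slab_weight) * spike + slab_weight * slab


def neg_log_prior(beta, slab_weight):
    """Penalty that the prior places on a coefficient value."""
    return -math.log(spike_slab_pdf(beta, slab_weight))


if __name__ == "__main__":
    # A modest coefficient is penalised less when more weight sits on the slab.
    for w in (0.1, 0.5, 0.9):
        print(f"slab_weight={w:.1f}  penalty at beta=0.3: {neg_log_prior(0.3, w):.3f}")
```

In this sketch, raising `slab_weight` for a given coefficient lowers the penalty on moderate values, so that coefficient is shrunk less. A prior that assigns more slab weight to variables associated with the treatment would protect them from heavy shrinkage in the same way.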

## Key findings

- Reduces confounding bias in high-dimensional data
- Shrinks coefficients of instrumental variables effectively
- Achieves good coverage in small samples

## Abstract

In observational studies, estimation of a causal effect of a treatment on an outcome relies on proper adjustment for confounding. If the number of potential confounders ($p$) is larger than the number of observations ($n$), then direct control for all potential confounders is infeasible. Existing approaches for dimension reduction and penalization are generally aimed at predicting the outcome, and are less suited for estimation of causal effects. Under standard penalization approaches (e.g. Lasso), if a variable $X_j$ is strongly associated with the treatment $T$ but weakly with the outcome $Y$, the coefficient $\beta_j$ will be shrunk towards zero, thus leading to confounding bias. Under the assumption of a linear model for the outcome and sparsity, we propose continuous spike and slab priors on the regression coefficients $\beta_j$ corresponding to the potential confounders $X_j$. Specifically, we introduce a prior distribution that does not heavily shrink towards zero the coefficients ($\beta_j$s) of the $X_j$s that are strongly associated with $T$ but weakly associated with $Y$. We compare our proposed approach to several state-of-the-art methods proposed in the literature. Our proposed approach has the following features: 1) it reduces confounding bias in high-dimensional settings; 2) it shrinks the coefficients of instrumental variables towards zero; and 3) it achieves good coverage even in small sample sizes. We apply our approach to the National Health and Nutrition Examination Survey (NHANES) data to estimate the causal effects of persistent pesticide exposure on triglyceride levels.
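The confounding bias described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method: it uses a single confounder and compares the treatment-effect estimate when that confounder is kept in the model with the estimate when its coefficient is forced to exactly zero, which is the limiting case of heavy shrinkage. All parameter values (`effect`, `beta_x`, `alpha`, `n`) are assumptions chosen for illustration.

```python
import random


def simulate(n=20000, effect=1.0, beta_x=0.3, alpha=2.0, seed=0):
    """Draw (X, T, Y) where X drives T strongly (alpha) and Y weakly (beta_x)."""
    rng = random.Random(seed)
    xs, ts, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        t = alpha * x + rng.gauss(0.0, 1.0)
        y = effect * t + beta_x * x + rng.gauss(0.0, 1.0)
        xs.append(x)
        ts.append(t)
        ys.append(y)
    return xs, ts, ys


def cov(a, b):
    """Sample covariance (divisor n), enough for least-squares ratios."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def effect_unadjusted(ts, ys):
    """Slope of Y on T alone, i.e. the confounder's coefficient set to zero."""
    return cov(ts, ys) / cov(ts, ts)


def effect_adjusted(xs, ts, ys):
    """Coefficient on T from least squares of Y on (T, X), in closed form."""
    stt, sxx, stx = cov(ts, ts), cov(xs, xs), cov(ts, xs)
    sty, sxy = cov(ts, ys), cov(xs, ys)
    return (sty * sxx - sxy * stx) / (stt * sxx - stx ** 2)


if __name__ == "__main__":
    xs, ts, ys = simulate()
    print("true effect:          1.0")
    print(f"confounder dropped:   {effect_unadjusted(ts, ys):.3f}")
    print(f"confounder included:  {effect_adjusted(xs, ts, ys):.3f}")
```

Under these assumed values, dropping the confounder shifts the estimate by roughly $\beta_x \cdot \mathrm{Cov}(T, X) / \mathrm{Var}(T) = 0.3 \cdot 2 / 5 = 0.12$, even though the confounder's direct association with the outcome is weak. This is the pattern that motivates a prior that does not heavily shrink such coefficients.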

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.07532/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1704.07532/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/1704.07532/full.md

---
Source: https://tomesphere.com/paper/1704.07532