Hypothesis Testing in High-Dimensional Instrumental Variables Regression with an Application to Genomics Data
Jiarui Lu, Hongzhe Li

TL;DR
This paper develops hypothesis testing methods for high-dimensional instrumental variables regression, addressing challenges in genomics data where both genetic variants and gene expressions are high-dimensional, and applies these methods to identify gene-phenotype associations.
Contribution
The paper introduces hypothesis testing procedures for sparse high-dimensional IV models, including tests of single coefficients and multiple testing with FDR control, supported by theoretical and empirical validation.
Findings
The proposed methods control the false discovery rate in simulations
The proposed tests accurately identify true gene-phenotype associations
The application identifies genes linked to yeast growth under oxidative stress
Abstract
Gene expression and phenotype association can be affected by potential unmeasured confounders from multiple sources, leading to biased estimates of the associations. Since genetic variants largely explain gene expression variations, they can be used as instruments in studying the association between gene expressions and phenotype in the framework of high dimensional instrumental variable (IV) regression. However, because the dimensions of both genetic variants and gene expressions are often larger than the sample size, statistical inferences such as hypothesis testing for such high dimensional IV models are not trivial and have not been investigated in the literature. The problem is more challenging since the instrumental variables (e.g., genetic variants) have to be selected among a large set of genetic variants. This paper considers the problem of hypothesis testing for sparse IV…
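The role of instruments described in the abstract can be illustrated with a small simulation. The sketch below is a low-dimensional two-stage least squares example on synthetic data, not the paper's high-dimensional procedure: the sample size, the three instruments, the coefficient values, and all variable names are assumptions chosen for illustration. It shows that a naive regression of the phenotype on expression is biased by an unmeasured confounder, while regressing on the instrument-predicted expression recovers the true effect.

```python
import numpy as np

# Synthetic data; every number here is an illustrative assumption.
rng = np.random.default_rng(0)
n = 5000
beta_true = 1.0                       # true effect of expression on phenotype

z = rng.normal(size=(n, 3))           # instruments (stand-ins for genetic variants)
u = rng.normal(size=n)                # unmeasured confounder
x = z @ np.array([1.0, 0.5, -0.5]) + u + rng.normal(size=n)   # gene expression
y = beta_true * x + 2.0 * u + rng.normal(size=n)              # phenotype

# Naive least squares of y on x: biased because u drives both x and y.
beta_ols = (x @ y) / (x @ x)

# Stage 1: project expression onto the instruments.
gamma_hat, *_ = np.linalg.lstsq(z, x, rcond=None)
x_hat = z @ gamma_hat

# Stage 2: regress the phenotype on the predicted expression.
beta_2sls = (x_hat @ y) / (x_hat @ x_hat)

print(f"naive OLS estimate: {beta_ols:.3f}")
print(f"2SLS estimate:      {beta_2sls:.3f}")
```

In this setup the naive estimate is pulled away from the true value of 1.0 by the confounder, while the two-stage estimate stays close to it. When the numbers of variants and expressions exceed the sample size, as in the setting the abstract describes, plain least squares is no longer usable in either stage.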
Taxonomy
Topics: Gene expression and cancer classification · Statistical Methods and Inference · Spectroscopy and Chemometric Analyses
