Extending Model-x Framework to Missing Data
Deniz Koyuncu, Alex Gittens, Bülent Yener

TL;DR
This paper extends the model-x knockoffs framework to datasets with missing values, proposing methods that provably preserve false discovery control.
Contribution
It introduces approaches for handling missing data within model-x knockoffs, including posterior sampled imputation and joint imputation and knockoff sampling, that maintain the framework's false selection guarantees.
Findings
Posterior sampled imputation allows reuse of knockoff samplers with missing data.
Sampling knockoffs for observed variables with univariate imputation preserves false discovery control.
Jointly imputing and sampling knockoffs can reduce computational complexity in latent variable models.
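The first finding can be illustrated with a minimal sketch, assuming the features are multivariate Gaussian with known mean `mu` and covariance `Sigma` and that missing entries are marked as NaN. Neither function below is taken from the paper: `posterior_impute_gaussian` is a hypothetical helper that draws each missing block from its Gaussian conditional given the observed entries, and `gaussian_knockoffs` is the standard Gaussian knockoff construction with an equal-valued diagonal, standing in for "an existing knockoff sampler".

```python
import numpy as np

def posterior_impute_gaussian(X, mu, Sigma, rng):
    """Fill NaNs in each row with one draw from p(x_missing | x_observed).

    Assumption: rows are i.i.d. N(mu, Sigma); this is an illustrative model,
    not necessarily the one used in the paper.
    """
    X_imp = X.copy()
    for i in range(X.shape[0]):
        m = np.isnan(X[i])          # missing coordinates of row i
        if not m.any():
            continue
        o = ~m                      # observed coordinates of row i
        if not o.any():             # nothing observed: draw from the marginal
            X_imp[i] = rng.multivariate_normal(mu, Sigma)
            continue
        S_oo = Sigma[np.ix_(o, o)]
        S_mo = Sigma[np.ix_(m, o)]
        S_mm = Sigma[np.ix_(m, m)]
        K = np.linalg.solve(S_oo, S_mo.T).T        # S_mo @ inv(S_oo)
        cond_mean = mu[m] + K @ (X[i, o] - mu[o])
        cond_cov = S_mm - K @ S_mo.T
        cond_cov = (cond_cov + cond_cov.T) / 2     # symmetrize numerically
        X_imp[i, m] = rng.multivariate_normal(cond_mean, cond_cov)
    return X_imp

def gaussian_knockoffs(X, mu, Sigma, rng):
    """Standard Gaussian knockoff sampler with D = s * I (illustrative stand-in)."""
    p = Sigma.shape[0]
    # s < 2 * lambda_min(Sigma) keeps the conditional covariance positive semidefinite.
    s = 0.99 * min(2 * np.linalg.eigvalsh(Sigma)[0], Sigma.diagonal().min())
    D = s * np.eye(p)
    Sinv_D = np.linalg.solve(Sigma, D)             # inv(Sigma) @ D
    cond_mean = mu + (X - mu) @ (np.eye(p) - Sinv_D)
    cond_cov = 2 * D - D @ Sinv_D
    noise = rng.multivariate_normal(np.zeros(p), cond_cov, size=X.shape[0])
    return cond_mean + noise

# Usage: impute by posterior sampling, then reuse the unchanged knockoff sampler.
rng = np.random.default_rng(0)
p = 4
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
mu = np.zeros(p)
X = rng.multivariate_normal(mu, Sigma, size=50)
X_miss = X.copy()
X_miss[rng.random(X.shape) < 0.2] = np.nan         # hide about 20% of entries
X_imp = posterior_impute_gaussian(X_miss, mu, Sigma, rng)
X_knock = gaussian_knockoffs(X_imp, mu, Sigma, rng)
```

The point of the sketch is that the knockoff sampler is untouched: only the imputation step is aware of missingness. Replacing the posterior draw with a deterministic fill such as the conditional mean would change the distribution of the completed rows, which is why the sketch samples rather than plugs in a point estimate.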
Abstract
One limitation of most statistical/machine learning-based variable selection approaches is their inability to control false selections. A recently introduced framework, model-x knockoffs, provides such control for a wide range of models but lacks support for datasets with missing values. In this work, we discuss ways of preserving the theoretical guarantees of the model-x framework in the missing data setting. First, we prove that posterior sampled imputation allows reusing existing knockoff samplers in the presence of missing values. Second, we show that sampling knockoffs only for the observed variables and applying univariate imputation also preserves the false selection guarantees. Third, for the special case of latent variable models, we demonstrate how jointly imputing and sampling knockoffs can reduce the computational complexity. We have verified the theoretical findings with two…
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Statistical Methods and Inference · Bayesian Methods and Mixture Models
