Fast Bayesian Variable Selection in Binomial and Negative Binomial Regression
Martin Jankowiak

TL;DR
This paper introduces an efficient MCMC method for Bayesian variable selection in binomial and negative binomial regression, making the analysis of high-dimensional count data computationally feasible.
Contribution
It develops an MCMC scheme based on Tempered Gibbs Sampling (Zanella and Roberts, 2019) for variable selection in binomial and negative binomial regression, with logistic regression as a special case.
Findings
Effective on cancer data with seventeen thousand covariates
Targets computationally difficult regimes: many covariates and non-conjugate likelihoods
Applicable to count data models common in biology, ecology, and economics
Abstract
Bayesian variable selection is a powerful tool for data analysis, as it offers a principled method for variable selection that accounts for prior information and uncertainty. However, wider adoption of Bayesian variable selection has been hampered by computational challenges, especially in difficult regimes with a large number of covariates or non-conjugate likelihoods. Generalized linear models for count data, which are prevalent in biology, ecology, economics, and beyond, represent an important special case. Here we introduce an efficient MCMC scheme for variable selection in binomial and negative binomial regression that exploits Tempered Gibbs Sampling (Zanella and Roberts, 2019) and that includes logistic regression as a special case. In experiments we demonstrate the effectiveness of our approach, including on cancer data with seventeen thousand covariates.
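The abstract notes that logistic regression is a special case of binomial regression. The following is a minimal sketch of that relationship, not the paper's sampler: it assumes a logit link and a hypothetical vector of inclusion indicators `gamma` that switches covariates on or off, and checks numerically that the binomial log-likelihood with one trial per observation equals the logistic (Bernoulli) log-likelihood. All names and the simulated data are illustrative assumptions.

```python
import numpy as np
from math import lgamma


def binomial_loglik(y, n, psi):
    """Log-likelihood of y successes out of n trials, with log-odds psi."""
    y = np.asarray(y, dtype=float)
    n = np.asarray(n, dtype=float)
    psi = np.asarray(psi, dtype=float)
    # log of the binomial coefficient C(n, y), computed via log-gamma
    log_binom = np.array(
        [lgamma(ni + 1) - lgamma(yi + 1) - lgamma(ni - yi + 1)
         for yi, ni in zip(y, n)]
    )
    # y*log(p) + (n-y)*log(1-p) with p = sigmoid(psi) simplifies to
    # y*psi - n*log(1 + exp(psi)); logaddexp keeps this numerically stable
    return float(np.sum(log_binom + y * psi - n * np.logaddexp(0.0, psi)))


def logistic_loglik(y, psi):
    """Bernoulli log-likelihood of binary y under a logit link."""
    y = np.asarray(y, dtype=float)
    p = 1.0 / (1.0 + np.exp(-np.asarray(psi, dtype=float)))
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


# Simulated data (illustrative only): 50 observations, 5 covariates
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
gamma = np.array([True, False, True, False, False])  # hypothetical inclusion indicators
beta = np.array([1.0, 0.0, -0.5, 0.0, 0.0])          # hypothetical coefficients

# Only included covariates contribute to the linear predictor
psi = X[:, gamma] @ beta[gamma]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-psi)))

# With one trial per observation the two log-likelihoods coincide
ll_binomial = binomial_loglik(y, np.ones_like(y), psi)
ll_logistic = logistic_loglik(y, psi)
print(np.isclose(ll_binomial, ll_logistic))
```

Variable selection then amounts to inferring which entries of `gamma` are switched on; the sketch fixes `gamma` by hand and makes no attempt to reproduce the Tempered Gibbs Sampling scheme described in the paper.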
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Metabolomics and Mass Spectrometry Studies
Methods: Logistic Regression
