Variable Selection with ABC Bayesian Forests
Yi Liu, Veronika Ročková, Yuexi Wang

TL;DR
This paper introduces a Bayesian tree-based variable selection method that is consistent in high-dimensional non-linear models, and proposes an ABC sampling approach to efficiently identify important variables.
Contribution
It develops the first model selection consistency results for Bayesian forest priors and introduces ABC Bayesian Forests, a novel data-splitting ABC method for non-parametric variable selection.
Findings
The method is consistent for variable selection even when the number of covariates exceeds the sample size (p > n).
ABC Bayesian Forests achieve higher ABC acceptance rates through data splitting.
The method successfully identifies variables with high marginal inclusion probabilities.
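To make the data-splitting idea concrete, here is a minimal rejection-style sketch in Python. It is an illustration under stated assumptions, not the authors' algorithm. The function names (`knn_predict`, `abc_inclusion_probs`) and all parameters are hypothetical. A k-nearest-neighbour average stands in for a draw from a tree or forest prior. The independent Bernoulli prior on variable inclusion is assumed. The discrepancy is plain held-out squared error rather than a comparison against simulated noisy pseudo-data. Each draw proposes a variable subset, fits on one random half of the data, scores on the other half, and is accepted if its error falls in the lowest fraction of all draws. Inclusion frequencies among accepted subsets then approximate marginal inclusion probabilities.

```python
import numpy as np


def knn_predict(X_train, y_train, X_test, k=5):
    """Stand-in nonparametric fit (NOT a tree): average the k nearest responses."""
    # Squared Euclidean distance between every test row and every training row.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def abc_inclusion_probs(X, y, n_draws=1000, prior_incl=0.3,
                        accept_frac=0.05, seed=0):
    """Approximate marginal inclusion probabilities by rejection sampling.

    prior_incl and accept_frac are illustrative tuning choices (assumptions).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    subsets = np.zeros((n_draws, p), dtype=bool)
    errors = np.empty(n_draws)

    for m in range(n_draws):
        # 1. Propose a variable subset from an independent Bernoulli prior.
        gamma = rng.random(p) < prior_incl

        # 2. Split the data at random: one half to fit, one half to check.
        perm = rng.permutation(n)
        fit_idx, check_idx = perm[: n // 2], perm[n // 2:]

        # 3. Fit on the first half using only the proposed variables.
        if gamma.any():
            pred = knn_predict(X[fit_idx][:, gamma], y[fit_idx],
                               X[check_idx][:, gamma])
        else:
            # Empty subset: fall back to the mean of the fitting half.
            pred = np.full(check_idx.size, y[fit_idx].mean())

        # 4. Discrepancy between predictions and the held-out responses.
        errors[m] = np.mean((y[check_idx] - pred) ** 2)
        subsets[m] = gamma

    # 5. Accept the draws whose discrepancy is within the lowest accept_frac.
    eps = np.quantile(errors, accept_frac)
    accepted = subsets[errors <= eps]

    # 6. Inclusion frequency of each variable among accepted subsets.
    return accepted.mean(axis=0)
```

Calling `abc_inclusion_probs(X, y)` on an `(n, p)` design matrix returns a length-`p` vector. Selecting the variables whose value exceeds 0.5 is one common thresholding choice, assumed here rather than taken from the summary above.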
Abstract
Few problems in statistics are as perplexing as variable selection in the presence of very many redundant covariates. The variable selection problem is most familiar in parametric environments such as the linear model or additive variants thereof. In this work, we abandon the linear model framework, which can be quite detrimental when the covariates impact the outcome in a non-linear way, and turn to tree-based methods for variable selection. Such variable screening is traditionally done by pruning down large trees or by ranking variables based on some importance measure. Despite being heavily used in practice, these ad-hoc selection rules are not yet well understood from a theoretical point of view. In this work, we devise a Bayesian tree-based probabilistic method and show that it is consistent for variable selection when the regression surface is a smooth mix of covariates. These…
