flexBART: Flexible Bayesian regression trees with categorical predictors
Sameer K. Deshpande

TL;DR
flexBART is a Bayesian regression tree method whose splits can send several levels of a categorical predictor to each branch and can partition spatial data into contiguous regions, improving out-of-sample predictive accuracy and scalability over traditional BART implementations.
Contribution
The paper re-implements BART with regression trees that can assign multiple levels of a categorical predictor to both branches of a decision node, and proposes a new decision rule prior that creates spatially contiguous regions.
Findings
Improved out-of-sample predictive performance
Better scalability to large datasets
Enhanced modeling of spatially contiguous regions
Abstract
Most implementations of Bayesian additive regression trees (BART) one-hot encode categorical predictors, replacing each one with several binary indicators, one for every level or category. Regression trees built with these indicators partition the discrete set of categorical levels by repeatedly removing one level at a time. Unfortunately, the vast majority of partitions cannot be built with this strategy, severely limiting BART's ability to partially pool data across groups of levels. Motivated by analyses of baseball data and neighborhood-level crime dynamics, we overcame this limitation by re-implementing BART with regression trees that can assign multiple levels to both branches of a decision tree node. To model spatial data aggregated into small regions, we further proposed a new decision rule prior that creates spatially contiguous regions by deleting a random edge from a random…
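To make the limitation of one-hot encoding concrete, the sketch below compares the two-way splits of K levels reachable by a single rule on a one-hot indicator (one level versus the rest) with the number of all possible two-way splits, 2^(K-1) - 1. It also draws a random split that puts multiple levels on both branches. The function names are hypothetical, and the uniform coin-flip assignment is an illustrative assumption, not necessarily the decision rule prior used in the paper.

```python
import random


def one_hot_splits(levels):
    """Distinct two-way splits reachable by one rule on a one-hot
    indicator: a single level on one branch, all others on the other."""
    levels = frozenset(levels)
    return {frozenset({frozenset({lvl}), levels - {lvl}}) for lvl in levels}


def count_all_splits(n_levels):
    """Number of ways to divide n_levels into two non-empty, unordered groups."""
    return 2 ** (n_levels - 1) - 1


def random_subset_split(levels, rng):
    """Assign each level to the left branch with probability 1/2, redrawing
    until both branches are non-empty (illustrative assumption only)."""
    levels = list(levels)
    if len(levels) < 2:
        raise ValueError("need at least two levels to split")
    while True:
        left = {lvl for lvl in levels if rng.random() < 0.5}
        if 0 < len(left) < len(levels):
            return left, set(levels) - left


if __name__ == "__main__":
    k = 10
    print(len(one_hot_splits(range(k))), "one-vs-rest splits")
    print(count_all_splits(k), "possible two-way splits")
    print(random_subset_split(range(k), random.Random(0)))
```

With ten levels, a single one-hot rule reaches only 10 of the 511 possible two-way splits, which is the gap that multi-level splits are meant to close.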
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Analysis with R · Statistical Methods and Bayesian Inference
