Bayesian Decision Trees via Tractable Priors and Probabilistic Context-Free Grammars
Colin Sullivan, Mo Tiwari, Sebastian Thrun, Chris Piech

TL;DR
This paper introduces BCART-PCFG, a Bayesian decision tree method that samples efficiently from the posterior using probabilistic context-free grammars. The resulting trees are smaller and more stable than those built by traditional greedy methods while remaining competitive in accuracy.
Contribution
The paper proposes a novel criterion and a probabilistic context-free grammar approach for Bayesian decision trees, enabling efficient sampling and MAP estimation with polynomial time complexity.
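To make "sampling from a probabilistic context-free grammar" concrete, here is a minimal sketch that draws tree shapes from a toy two-rule grammar. The grammar, the `p_split` parameter, the `max_depth` cap, and the function names are all hypothetical and chosen for illustration; this is not the grammar or the sampler used in the paper.

```python
import random

# Toy grammar over tree shapes (an assumption for illustration only):
#   T -> leaf          with probability 1 - p_split
#   T -> split(T, T)   with probability p_split

def sample_tree(p_split, rng, depth=0, max_depth=5):
    """Draw one tree shape by expanding the nonterminal T top-down."""
    # Stop at the depth cap, or when the 'leaf' rule is drawn.
    if depth >= max_depth or rng.random() >= p_split:
        return "leaf"
    left = sample_tree(p_split, rng, depth + 1, max_depth)
    right = sample_tree(p_split, rng, depth + 1, max_depth)
    return ("split", left, right)

def num_leaves(tree):
    """Count the leaves of a sampled tree shape."""
    if tree == "leaf":
        return 1
    _, left, right = tree
    return num_leaves(left) + num_leaves(right)

if __name__ == "__main__":
    rng = random.Random(0)
    shape = sample_tree(0.4, rng)
    print(shape, num_leaves(shape))
```

Because each nonterminal is expanded independently of the others, every draw is an exact sample from the grammar's distribution, with no Markov chain to mix.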
Findings
Trees sampled via BCART-PCFG are up to 20x smaller than greedily constructed trees.
Classification accuracy is comparable to or better than that of greedy trees.
Sampling is efficient and scales with dataset size.
Abstract
Decision Trees are some of the most popular machine learning models today due to their out-of-the-box performance and interpretability. Often, Decision Tree models are constructed greedily in a top-down fashion via heuristic search criteria, such as Gini impurity or entropy. However, trees constructed in this manner are sensitive to minor fluctuations in training data and are prone to overfitting. In contrast, Bayesian approaches to tree construction formulate the selection process as a posterior inference problem; such approaches are more stable and provide greater theoretical guarantees. However, generating Bayesian Decision Trees usually requires sampling from complex, multimodal posterior distributions. Current Markov Chain Monte Carlo-based approaches for sampling Bayesian Decision Trees are prone to mode collapse and long mixing times, which makes them impractical. In this paper,…
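The greedy criteria the abstract mentions can be sketched as follows. This is a minimal illustration of Gini impurity, entropy, and a single greedy threshold search on one feature; the function names are hypothetical, and it shows the baseline approach the abstract contrasts against, not the paper's method.

```python
import math
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 - sum_k p_k^2 over class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum_k p_k * log2(p_k)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(xs, ys, criterion=gini_impurity):
    """Greedy step: choose the threshold t (split x <= t vs. x > t)
    that minimizes the size-weighted impurity of the two children."""
    best_t, best_score = None, float("inf")
    n = len(xs)
    for t in sorted(set(xs))[:-1]:  # the largest value would leave the right child empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * criterion(left) + len(right) * criterion(right)) / n
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

if __name__ == "__main__":
    print(best_threshold([1, 2, 3, 4], [0, 0, 1, 1]))
```

A greedy builder repeats this step recursively on each child. Each split is chosen from the local data alone, which is one way to see why small changes in the training set can change the resulting tree.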
Taxonomy
Topics: Machine Learning and Data Classification · Topic Modeling · Natural Language Processing Techniques
