On Theory for BART
Veronika Rockova, Enakshi Saha

TL;DR
This paper provides a theoretical foundation for BART, a popular Bayesian ensemble method in which each learner is a tree, and uses branching process analysis to establish an optimal posterior convergence rate.
Contribution
It analyzes the exact BART prior, proposes a modification for optimality, and derives tail bounds and convergence rates using branching process theory.
Findings
A simple modification of the exact BART prior gives it optimality properties.
Tail bounds for heterogeneous Galton-Watson processes are established.
With the modified prior, BART attains the optimal rate of posterior convergence.
Abstract
Ensemble learning is a statistical paradigm built on the premise that many weak learners can perform exceptionally well when deployed collectively. The BART method of Chipman et al. (2010) is a prominent example of Bayesian ensemble learning, where each learner is a tree. Due to its impressive performance, BART has received a lot of attention from practitioners. Despite its wide popularity, however, theoretical studies of BART have begun emerging only very recently. Laying the foundations for the theoretical analysis of Bayesian forests, Rockova and van der Pas (2017) showed optimal posterior concentration under conditionally uniform tree priors. These priors deviate from the actual priors implemented in BART. Here, we study the exact BART prior and propose a simple modification so that it also enjoys optimality properties. To this end, we dive into branching process theory. We obtain…
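The link between the BART tree prior and branching processes can be illustrated with a small simulation. The sketch below assumes the split probability from Chipman et al. (2010), where a node at depth d splits with probability alpha * (1 + d)^(-beta), with alpha = 0.95 and beta = 2 as commonly used defaults. Because the offspring distribution changes with depth, the tree grows as a heterogeneous Galton-Watson process. The function names (`bart_split_prob`, `sample_tree_size`, `empirical_tail`) and the `max_depth` cap are illustrative choices, not part of the paper, and the Monte Carlo tail estimate is only a numerical illustration, not the paper's analytical bound.

```python
import random


def bart_split_prob(depth, alpha=0.95, beta=2.0):
    """Split probability alpha * (1 + depth)^(-beta) from Chipman et al. (2010)."""
    return alpha * (1.0 + depth) ** (-beta)


def sample_tree_size(split_prob, rng, max_depth=50):
    """Grow one binary tree generation by generation.

    Each node at depth d splits into two children with probability
    split_prob(d), so this is a depth-dependent (heterogeneous)
    Galton-Watson process. Returns (number of leaves, tree depth).
    max_depth is an illustrative safety cap, not part of the prior.
    """
    frontier = 1   # nodes alive at the current depth
    depth = 0
    leaves = 0
    deepest = 0
    while frontier > 0 and depth < max_depth:
        p = split_prob(depth)
        splits = sum(1 for _ in range(frontier) if rng.random() < p)
        leaves += frontier - splits   # nodes that did not split become leaves
        frontier = 2 * splits         # each split produces two children
        if frontier > 0:
            deepest = depth + 1
        depth += 1
    leaves += frontier  # nodes cut off by the cap are counted as leaves
    return leaves, deepest


def empirical_tail(split_prob, k, n_draws=10000, seed=0):
    """Monte Carlo estimate of P(number of leaves > k) under the tree prior."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_draws) if sample_tree_size(split_prob, rng)[0] > k
    )
    return hits / n_draws


if __name__ == "__main__":
    for k in (1, 2, 4, 8, 16):
        print(k, empirical_tail(bart_split_prob, k))
```

Passing a different `split_prob` function (for example, a hypothetical faster-decaying rule) makes it possible to compare how quickly the prior mass on large trees falls off under alternative priors.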
