Posterior Concentration for Bayesian Regression Trees and Forests

Veronika Rockova; Stephanie van der Pas

arXiv:1708.08734 · math.ST · June 17, 2019

TL;DR

This paper uses posterior concentration analysis to explain why Bayesian regression trees and forests can adaptively recover smooth functions without overfitting.

Contribution

It introduces a spike-and-tree variant of the Bayesian CART prior and proves posterior concentration at near-optimal rates (optimal up to a logarithmic factor), which helps explain the practical success of Bayesian trees.

Findings

01 Bayesian trees achieve near-optimal recovery of smooth functions.

02 They adapt to unknown smoothness levels.

03 They perform effective dimension reduction in high-dimensional settings.

Abstract

Since their inception in the 1980s, regression trees have been one of the more widely used non-parametric prediction methods. Tree-structured methods yield a histogram reconstruction of the regression surface, where the bins correspond to terminal nodes of recursive partitioning. Trees are powerful, yet susceptible to overfitting. Strategies against overfitting have traditionally relied on pruning greedily grown trees. The Bayesian framework offers an alternative remedy against overfitting through priors. Roughly speaking, a good prior charges smaller trees where overfitting does not occur. While the consistency of random histograms, trees and their ensembles has been studied quite extensively, the theoretical understanding of the Bayesian counterparts has been missing. In this paper, we take a step towards understanding why and when Bayesian trees and their ensembles do not overfit. To…
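
The histogram view of a tree mentioned in the abstract can be illustrated with a minimal sketch. It assumes a one-dimensional covariate and split points that are fixed in advance rather than learned or drawn from a prior, so it is not the Bayesian CART or spike-and-tree procedure studied in the paper. The function name `fit_histogram_tree`, the split point, and the toy data are all hypothetical.

```python
from bisect import bisect_right


def fit_histogram_tree(xs, ys, split_points):
    """Fit a piecewise-constant regressor on fixed split points.

    Each interval between consecutive split points plays the role of a
    terminal node (leaf); its prediction is the mean response in that bin.
    Hypothetical helper for illustration only, not the paper's method.
    """
    cuts = sorted(split_points)
    sums = [0.0] * (len(cuts) + 1)
    counts = [0] * (len(cuts) + 1)
    for x, y in zip(xs, ys):
        leaf = bisect_right(cuts, x)  # index of the bin containing x
        sums[leaf] += y
        counts[leaf] += 1
    # Empty leaves fall back to the overall mean (an arbitrary choice here).
    overall = sum(ys) / len(ys)
    leaf_means = [s / c if c else overall for s, c in zip(sums, counts)]

    def predict(x):
        return leaf_means[bisect_right(cuts, x)]

    return predict


# Toy data: one split at 0.5 gives a two-leaf tree, i.e. a two-bin histogram.
xs = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
ys = [1.0, 3.0, 2.0, 6.0, 9.0, 12.0]
predict = fit_histogram_tree(xs, ys, split_points=[0.5])
print(predict(0.3), predict(0.7))  # prints "2.0 9.0"
```

Adding more split points makes the bins narrower and the fit more flexible, which is where overfitting enters; a prior that favors smaller trees, as the abstract describes, corresponds to favoring fewer bins.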
