On the Computational Efficiency of Bayesian Additive Regression Trees: An Asymptotic Analysis

Yan Shuo Tan; Omer Ronen; Theo Saarinen; Bin Yu

arXiv:2406.19958 · stat.ML · February 10, 2026 · 1 cites

TL;DR

This paper analyzes the asymptotic computational efficiency of the Bayesian Additive Regression Trees (BART) sampler, shows how its convergence time scales with the number of training samples, and proposes modifications to improve performance.

Contribution

The paper provides the first asymptotic analysis of the BART sampler's convergence behavior and suggests practical strategies for improving its computational efficiency on large datasets.

Findings

01. The sampler's convergence time increases with sample size because the posterior is multi-modal.

02. Increasing the number of trees or the temperature reduces convergence time.

03. The default BART sampler shows the same convergence trend across various settings.
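The first two findings can be illustrated with a toy calculation. The sketch below is not the BART sampler and is not taken from the paper: it is a one-dimensional Metropolis chain on a hypothetical double-well energy, with target proportional to exp(-n * energy / T). The names `energy`, `accept_prob`, and `expected_hitting_time`, the state space, and the well positions are all assumptions made for illustration. The factor n stands in for the way a posterior sharpens as the sample size grows, and T for a temperature that flattens it. The expected time to first reach the deeper mode from the shallower one is computed exactly with the standard birth-death recursion, so no simulation is involved.

```python
import math

START, TARGET = 4, 16  # shallow mode and deep mode of the toy energy (assumed)


def energy(x):
    """Hypothetical double-well energy: shallow well at 4, deeper well at 16."""
    return min((x - 4) ** 2, (x - 16) ** 2 - 5) / 36.0


def accept_prob(x_from, x_to, n, temperature):
    """Metropolis acceptance for a target proportional to exp(-n * energy / T)."""
    delta = (n / temperature) * (energy(x_to) - energy(x_from))
    return math.exp(min(0.0, -delta))


def expected_hitting_time(n, temperature, start=START, target=TARGET):
    """Exact expected number of steps to first reach `target` from `start`.

    Uses the birth-death recursion t_k = (1 + q_k * t_{k-1}) / p_k, where t_k
    is the expected time to move from state k to k + 1, p_k is the probability
    of stepping up, and q_k is the probability of stepping down. State 0 is a
    reflecting boundary, so q_0 = 0.
    """
    total = 0.0
    t_prev = 0.0
    for k in range(target):
        p_up = 0.5 * accept_prob(k, k + 1, n, temperature)
        q_down = 0.5 * accept_prob(k, k - 1, n, temperature) if k > 0 else 0.0
        t_k = (1.0 + q_down * t_prev) / p_up
        if k >= start:
            total += t_k
        t_prev = t_k
    return total


# Hitting time grows with n and shrinks again when the temperature is raised.
for n, temp in [(10, 1.0), (20, 1.0), (40, 1.0), (40, 2.0)]:
    print(f"n={n:>2}, T={temp}: {expected_hitting_time(n, temp):.3e} steps")
```

In this toy chain the hitting time depends on n and T only through the ratio n / T, so doubling the temperature exactly offsets doubling n. That is a property of the sketch, not a claim about BART, where the paper's analysis concerns tree-structured posteriors.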

Abstract

Bayesian Additive Regression Trees (BART) is a popular Bayesian non-parametric regression model that is commonly used in causal inference and beyond. Its strong predictive performance is supported by well-developed estimation theory, comprising guarantees that its posterior distribution concentrates around the true regression function at optimal rates under various data generative settings and for appropriate prior choices. However, the computational properties of the widely-used BART sampler proposed by Chipman et al. (2010) are yet to be well-understood. In this paper, we perform an asymptotic analysis of a slightly modified version of the default BART sampler when fitted to data-generating processes with discrete covariates. We show that the sampler's time to convergence, evaluated in terms of the hitting time of a high posterior density set, increases with the number of training…

Code & Models

Repositories

theo-s/bart-hitting-time-sims (Official)

Taxonomy

Topics: Machine Learning and Data Classification
