What Functions Does XGBoost Learn?

Dohyeong Ki; Adityanand Guntuboyina

arXiv:2601.05444·math.ST·January 12, 2026

TL;DR

This paper provides a rigorous theoretical characterization of the function class learned by XGBoost, connecting it to classical variation measures and establishing near-optimal convergence rates.

Contribution

The paper introduces a new infinite-dimensional function class and an associated complexity measure that together explain XGBoost's implicit regularization and support its theoretical analysis.

Findings

01. Every optimizer of the XGBoost objective also solves an equivalent penalized regression problem over a new infinite-dimensional function class.

02. The function class is related to Hardy-Krause variation, a classical notion of variation for multivariate functions.

03. The estimator defined over this class achieves near-minimax convergence rates.
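
Finding 01 can be written schematically as a penalized regression problem. The display below is only a sketch: it assumes squared-error loss and a tuning parameter $\lambda > 0$, neither of which is stated on this page, and $F$ and $V$ are shorthand for the paper's function class and complexity measure. The exact formulation in the paper may differ.

```latex
\hat{f} \in \operatorname*{arg\,min}_{f \in F}
  \; \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 \;+\; \lambda \, V(f)
```

Under this reading, larger $\lambda$ favors functions with smaller complexity $V(f)$, which is the sense in which the penalty acts as regularization.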

Abstract

This paper establishes a rigorous theoretical foundation for the function class implicitly learned by XGBoost, bridging the gap between its empirical success and our theoretical understanding. We introduce an infinite-dimensional function class $F_{\infty\text{-ST}}^{d,s}$ that extends finite ensembles of bounded-depth regression trees, together with a complexity measure $V_{\infty\text{-XGB}}^{d,s}(\cdot)$ that generalizes the $L^{1}$ regularization penalty used in XGBoost. We show that every optimizer of the XGBoost objective is also an optimizer of an equivalent penalized regression problem over $F_{\infty\text{-ST}}^{d,s}$ with penalty $V_{\infty\text{-XGB}}^{d,s}(\cdot)$, providing an interpretation of XGBoost as implicitly targeting a broader function class. We also develop a smoothness-based interpretation of $F_{\infty\text{-ST}}^{d,s}$ and…
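
To make the finite-ensemble starting point concrete, here is a minimal sketch of an $L^{1}$-penalized objective for a small ensemble of regression trees. It is an illustration, not XGBoost's API or the paper's construction: the `Node` class and the functions `predict_tree`, `leaf_l1`, and `penalized_objective` are hypothetical names, squared-error loss is assumed, and only an $L^{1}$ term on leaf weights (with hypothetical weight `alpha`) is included, leaving out XGBoost's other regularization terms.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """Hypothetical tree node; a node with feature=None is a leaf."""
    value: float = 0.0                 # leaf weight (used only at leaves)
    feature: Optional[int] = None      # index of the split coordinate
    threshold: float = 0.0             # split point
    left: Optional["Node"] = None      # branch taken when x[feature] < threshold
    right: Optional["Node"] = None     # branch taken otherwise


def predict_tree(node: Node, x: Sequence[float]) -> float:
    """Route x down the tree and return the weight of the leaf it reaches."""
    while node.feature is not None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.value


def leaf_l1(node: Node) -> float:
    """Sum of absolute leaf weights of one tree."""
    if node.feature is None:
        return abs(node.value)
    return leaf_l1(node.left) + leaf_l1(node.right)


def penalized_objective(trees, X, y, alpha: float) -> float:
    """Squared-error loss of the summed ensemble plus alpha * L1 leaf penalty."""
    loss = sum(
        (yi - sum(predict_tree(t, xi) for t in trees)) ** 2
        for xi, yi in zip(X, y)
    )
    penalty = alpha * sum(leaf_l1(t) for t in trees)
    return loss + penalty
```

The ensemble prediction is the sum of the individual tree outputs, and the penalty grows with the total absolute leaf weight across all trees, so two ensembles that fit the data equally well are ranked by how much total leaf weight they use.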

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning