TL;DR
This paper clarifies the differences between gradient and Newton boosting algorithms, introduces a hybrid variant, and demonstrates that Newton boosting generally achieves superior predictive accuracy across various datasets and loss functions.
Contribution
It unifies gradient, Newton, and hybrid boosting methods into a single framework and introduces a new tuning parameter for Newton boosting that improves accuracy.
Findings
Newton boosting outperforms gradient and hybrid boosting in predictive accuracy on the majority of datasets
Faster convergence is not the reason for Newton boosting's superior performance
A new interpretable tuning parameter enhances Newton boosting effectiveness
Abstract
Boosting algorithms are frequently used in applied data science and in research. To date, the distinction between boosting with gradient descent updates and boosting with second-order Newton updates is often not made in either applied or methodological research, and it is thus implicitly assumed that the difference is irrelevant. The goal of this article is to clarify this situation. In particular, we present gradient and Newton boosting, as well as a hybrid variant of the two, in a unified framework. We compare these boosting algorithms with trees as base learners using various datasets and loss functions. Our experiments show that Newton boosting outperforms gradient and hybrid gradient-Newton boosting in terms of predictive accuracy on the majority of datasets. We also present evidence that the reason for this is not faster convergence of Newton boosting. In addition, we introduce a novel tuning…
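To make the distinction in the abstract concrete, the following is a minimal sketch of the three update rules, not the paper's implementation. It assumes logistic loss, a single feature, and depth-1 trees (stumps). All function names (`grad_hess`, `fit_stump`, `boost`, `logloss`) are hypothetical, and the hybrid rule shown here follows the usual reading of "gradient step for the tree structure, Newton step for the leaf values".

```python
import numpy as np

def grad_hess(y, F):
    """Gradient and Hessian of the logistic loss for labels y in {0, 1} and scores F."""
    p = 1.0 / (1.0 + np.exp(-F))
    # Clip the Hessian away from zero so divisions below stay finite (illustrative safeguard).
    return p - y, np.maximum(p * (1.0 - p), 1e-12)

def logloss(y, F):
    """Mean logistic loss."""
    return np.mean(np.log1p(np.exp(-(2.0 * y - 1.0) * F)))

def fit_stump(x, target, weight):
    """Weighted least-squares stump on one feature; returns (threshold, left, right)."""
    order = np.argsort(x)
    xs, ts, ws = x[order], target[order], weight[order]
    best_err, best = np.inf, None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no valid split between identical feature values
        wl, wr = ws[:i].sum(), ws[i:].sum()
        left = (ws[:i] * ts[:i]).sum() / wl
        right = (ws[i:] * ts[i:]).sum() / wr
        err = (ws[:i] * (ts[:i] - left) ** 2).sum() + (ws[i:] * (ts[i:] - right) ** 2).sum()
        if err < best_err:
            best_err, best = err, (0.5 * (xs[i - 1] + xs[i]), left, right)
    return best

def predict_stump(stump, x):
    thr, left, right = stump
    return np.where(x <= thr, left, right)

def boost(x, y, n_iter=50, lr=0.1, method="newton"):
    """Run n_iter boosting iterations and return the training scores F."""
    F = np.zeros_like(x, dtype=float)
    for _ in range(n_iter):
        g, h = grad_hess(y, F)
        if method == "gradient":
            # Least-squares fit to the negative gradient.
            stump = fit_stump(x, -g, np.ones_like(g))
        elif method == "newton":
            # Hessian-weighted least squares on -g/h; leaf values equal -sum(g)/sum(h).
            stump = fit_stump(x, -g / h, h)
        elif method == "hybrid":
            # Split chosen by the gradient criterion, leaf values by a Newton step.
            thr, _, _ = fit_stump(x, -g, np.ones_like(g))
            m = x <= thr
            stump = (thr, -g[m].sum() / h[m].sum(), -g[~m].sum() / h[~m].sum())
        else:
            raise ValueError(f"unknown method: {method}")
        F += lr * predict_stump(stump, x)
    return F

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = (x + 0.5 * rng.normal(size=200) > 0).astype(float)
    for method in ("gradient", "newton", "hybrid"):
        print(method, logloss(y, boost(x, y, method=method)))
```

The only difference between the three variants is which quantities the base learner sees: gradient boosting ignores the Hessian entirely, Newton boosting uses it both to choose the split and to set the leaf values, and the hybrid uses it for the leaf values only. The synthetic data in the demo is purely illustrative and says nothing about the paper's empirical results.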
