Decision trees compensate for model misspecification

Hugh Panton; Gavin Leech; Laurence Aitchison

arXiv:2302.04081·stat.ML·February 9, 2023·1 cites

Open Access

TL;DR

This paper investigates why decision trees and gradient boosting machines (GBMs) perform well even when the data contain no true interactions. It attributes part of their success to robustness to model misspecification and proposes two methods for robust generalized linear models (GLMs).

Contribution

The paper confirms five alternative hypotheses about the role of tree depth in performance in the absence of true interactions, and introduces two methods for robust generalized linear models (GLMs) that address the composite and mixed response scenarios.

Findings

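01 Part of the success of tree models comes from their robustness to various forms of model misspecification.

02 Tree depth contributes to performance even when the data contain no true interactions.

03 Two methods for robust generalized linear models (GLMs) are proposed, addressing the composite and mixed response scenarios.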

Abstract

The best-performing models in ML are not interpretable. If we can explain why they outperform, we may be able to replicate these mechanisms and obtain both interpretability and performance. One example is decision trees and their descendants, gradient boosting machines (GBMs). These perform well in the presence of complex interactions, with tree depth governing the order of interactions. However, interactions cannot fully account for the depth of trees found in practice. We confirm five alternative hypotheses about the role of tree depth in performance in the absence of true interactions, and present results from experiments on a battery of datasets. Part of the success of tree models is due to their robustness to various forms of misspecification. We present two methods for robust generalized linear models (GLMs) addressing the composite and mixed response scenarios.
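The abstract's statement that tree depth governs the order of interactions can be illustrated with a toy sketch. This is not the authors' code or experimental setup: the data, the function names `boost_stumps`, `mse`, and `depth2_tree`, and the choice of squared loss with a learning rate of 1 are all assumptions made for illustration. Boosted depth-1 trees (stumps) form an additive model, so they fit an additive target but cannot fit a pure two-way interaction such as XOR, while a single depth-2 tree can.

```python
from itertools import product


def boost_stumps(X, y, rounds=20):
    """Gradient boosting with squared loss using depth-1 trees (stumps).

    Illustrative only: features are binary and the learning rate is 1.
    """
    n = len(y)
    pred = [sum(y) / n] * n  # start from the mean response
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        best = None
        for j in range(len(X[0])):
            # A stump on feature j predicts the mean residual on each side.
            means = {}
            for v in (0, 1):
                side = [r for x, r in zip(X, resid) if x[j] == v]
                means[v] = sum(side) / len(side)
            sse = sum((r - means[x[j]]) ** 2 for x, r in zip(X, resid))
            if best is None or sse < best[0]:
                best = (sse, j, means)
        _, j, means = best
        pred = [p + means[x[j]] for x, p in zip(X, pred)]
    return pred


def mse(y, pred):
    """Mean squared error between targets and predictions."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y)


def depth2_tree(x):
    """A hand-written depth-2 tree: split on x[0], then on x[1]."""
    if x[0] == 0:
        return 1.0 if x[1] == 1 else 0.0
    return 0.0 if x[1] == 1 else 1.0


# All four combinations of two binary features.
X = list(product((0, 1), repeat=2))
y_additive = [x1 + 2 * x2 for x1, x2 in X]  # no interaction
y_xor = [float(x1 != x2) for x1, x2 in X]   # pure two-way interaction

mse_additive = mse(y_additive, boost_stumps(X, y_additive))
mse_xor = mse(y_xor, boost_stumps(X, y_xor))
mse_xor_depth2 = mse(y_xor, [depth2_tree(x) for x in X])
```

In this sketch the stump ensemble drives the error on the additive target to zero, but makes no progress on XOR: every stump sees a mean residual of zero on both sides of any split, so the error stays at the variance of the target. The depth-2 tree represents XOR exactly. The paper's question is the converse case: why deeper trees still help when no such interaction is present.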

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning