Regularized impurity reduction: Accurate decision trees with complexity guarantees

Guangyi Zhang; Aristides Gionis

arXiv:2208.10949·cs.LG·November 29, 2022·1 cites

TL;DR

This paper introduces a decision tree induction algorithm with theoretical complexity guarantees, balancing accuracy and interpretability by optimizing impurity reduction and test selection.

Contribution

The paper proposes a simple enhancement to traditional impurity-based tree-induction methods that provides logarithmic approximation guarantees on tree complexity under broad settings.

Findings

01 Enhanced algorithms achieve better balance between accuracy and interpretability.

02 The proposed method provides theoretical complexity guarantees.

03 Empirical results show improved tree simplicity without sacrificing accuracy.

Abstract

Decision trees are popular classification models, providing high accuracy and intuitive explanations. However, as the tree size grows the model interpretability deteriorates. Traditional tree-induction algorithms, such as C4.5 and CART, rely on impurity-reduction functions that promote the discriminative power of each split. Thus, although these traditional methods are accurate in practice, there has been no theoretical guarantee that they will produce small trees. In this paper, we justify the use of a general family of impurity functions, including the popular functions of entropy and Gini-index, in scenarios where small trees are desirable, by showing that a simple enhancement can equip them with complexity guarantees. We consider a general setting, where objects to be classified are drawn from an arbitrary probability distribution, classification can be binary or multi-class, and…
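As a point of reference for the impurity functions named in the abstract, the following is a minimal sketch of entropy, Gini index, and the impurity reduction of a candidate split. It illustrates only the traditional criterion used by methods such as C4.5 and CART, not the paper's regularized enhancement, whose details are not given in this listing. The function names and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_reduction(parent, children, impurity):
    """Parent impurity minus the size-weighted impurity of the children.

    `children` is a list of label lists that partition `parent`.
    """
    n = len(parent)
    weighted = sum(len(ch) / n * impurity(ch) for ch in children)
    return impurity(parent) - weighted


# Hypothetical toy data: four objects, two classes.
parent = ["a", "a", "b", "b"]
perfect_split = [["a", "a"], ["b", "b"]]  # separates the classes completely
useless_split = [["a", "b"], ["a", "b"]]  # leaves each child as mixed as the parent

for name, fn in (("entropy", entropy), ("gini", gini)):
    print(name,
          impurity_reduction(parent, perfect_split, fn),
          impurity_reduction(parent, useless_split, fn))
```

A greedy inducer picks the split with the largest reduction at each node. That choice rewards discriminative power per split but, as the abstract notes, does not by itself bound the size of the resulting tree.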

Code & Models

Repositories

guangyi-zhang/low-expected-cost-decision-trees (Official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Test