Provable guarantees for decision tree induction: the agnostic setting

Guy Blanc; Jane Lange; Li-Yang Tan

arXiv:2006.00743·cs.DS·June 2, 2020

TL;DR

This paper gives provable guarantees for top-down decision tree heuristics in the agnostic setting: for monotone target functions, the trees they build have error within ε of the best size-s decision tree. Prior work focused on the realizable setting.

Contribution

It establishes the first provable performance guarantees for top-down decision tree heuristics in the agnostic setting, including upper and near-matching lower bounds.

Findings

01

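For monotone target functions, top-down heuristics build decision trees of size $s^{\tilde{O}((\log s)/ε^{2})}$ with error at most $\mathrm{opt}_{s} + ε$, where $\mathrm{opt}_{s}$ is the error of the optimal size-$s$ tree.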

02

No algorithm, top-down or otherwise, was previously known to achieve such a guarantee in the agnostic setting.

03

A near-matching lower bound of $s^{\tilde{Ω}(\log s)}$ is shown.

Abstract

We give strengthened provable guarantees on the performance of widely employed and empirically successful top-down decision tree learning heuristics. While prior works have focused on the realizable setting, we consider the more realistic and challenging agnostic setting. We show that for all monotone functions $f$ and parameters $s \in \mathbb{N}$, these heuristics construct a decision tree of size $s^{\tilde{O}((\log s)/ε^{2})}$ that achieves error $\leq \mathrm{opt}_{s} + ε$, where $\mathrm{opt}_{s}$ denotes the error of the optimal size-$s$ decision tree for $f$. Previously, such a guarantee was not known to be achievable by any algorithm, even one that is not based on top-down heuristics. We complement our algorithmic guarantee with a near-matching $s^{\tilde{Ω}(\log s)}$ lower bound.
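
To make "top-down heuristic" concrete, here is a minimal Python sketch of a generic greedy tree builder: it starts from a single leaf and repeatedly applies the leaf-and-feature split with the largest impurity reduction, then labels each leaf by majority vote. This is an illustration under assumptions, not the paper's exact procedure or analysis. The Gini impurity criterion, the function names (`grow_tree`, `best_split`, `tree_error`), and the toy data (a monotone function with one flipped label, to mimic the agnostic setting) are all choices made for this example.

```python
from itertools import product


def gini(labels):
    """Gini impurity 2p(1-p) of a list of 0/1 labels (0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, n_features):
    """Return (gain, feature) with the largest size-weighted impurity drop.

    Returns (0.0, None) when no feature gives a strictly positive gain.
    """
    labels = [y for _, y in rows]
    best = (0.0, None)
    for i in range(n_features):
        left = [y for x, y in rows if x[i] == 0]
        right = [y for x, y in rows if x[i] == 1]
        gain = (len(rows) * gini(labels)
                - len(left) * gini(left)
                - len(right) * gini(right))
        if gain > best[0] + 1e-12:
            best = (gain, i)
    return best


def grow_tree(data, n_features, max_leaves):
    """Greedy top-down growth, stopping at max_leaves or when no split helps."""
    # Each leaf is (path, rows): the feature values fixed on the way down,
    # and the labeled examples that reach the leaf.
    leaves = [({}, data)]
    while len(leaves) < max_leaves:
        scored = [(best_split(rows, n_features), idx)
                  for idx, (_, rows) in enumerate(leaves)]
        (gain, feat), idx = max(scored, key=lambda t: t[0][0])
        if feat is None:  # every leaf is pure or unsplittable under this criterion
            break
        path, rows = leaves.pop(idx)
        for v in (0, 1):
            child_rows = [(x, y) for x, y in rows if x[feat] == v]
            leaves.append(({**path, feat: v}, child_rows))
    return leaves


def majority(rows):
    """Majority label of the rows at a leaf (ties go to 0)."""
    labels = [y for _, y in rows]
    return int(2 * sum(labels) > len(labels))


def tree_error(leaves, total):
    """Fraction of examples misclassified when each leaf predicts its majority."""
    wrong = sum(sum(1 for _, y in rows if y != majority(rows))
                for _, rows in leaves)
    return wrong / total


# Toy data (an assumption for illustration): the monotone function
# f(x) = x0 AND (x1 OR x2) on all of {0,1}^3, with the label of 000 flipped.
def f(x):
    return int(x[0] and (x[1] or x[2]))


data = [(x, f(x)) for x in product((0, 1), repeat=3)]
data[0] = (data[0][0], 1 - data[0][1])  # inject one noisy label

errors = [tree_error(grow_tree(data, 3, k), len(data)) for k in range(1, 9)]
print(errors)
```

Because each split relabels the two children by majority, the training error in this sketch never increases as the leaf budget grows. Impurity-based greedy splitting can stall on non-monotone targets such as parity, where no single feature yields any gain, which is one intuition for why the abstract's guarantee is stated for monotone functions.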

Videos

Provable guarantees for decision tree induction: the agnostic setting (SlidesLive)