TL;DR
This paper introduces DTSemNet, a novel invertible encoding for oblique decision trees that enables training with vanilla gradient descent, yielding more accurate models and faster training across classification, regression, and reinforcement learning tasks.
Contribution
DTSemNet is a new, semantically equivalent encoding of hard oblique decision trees that can be trained with vanilla gradient descent, improving accuracy and training efficiency over existing differentiable DT methods.
Findings
Oblique DTs trained with DTSemNet outperform similar-sized models in accuracy.
Training time for DTs is significantly reduced using DTSemNet.
DTSemNet can efficiently learn DT policies in reinforcement learning tasks.
Abstract
Decision Trees (DTs) constitute one of the major highly non-linear AI models, valued, e.g., for their efficiency on tabular data. Learning accurate DTs is, however, complicated, especially for oblique DTs, and does take a significant training time. Further, DTs suffer from overfitting, e.g., they proverbially "do not generalize" in regression tasks. Recently, some works proposed ways to make (oblique) DTs differentiable. This enables highly efficient gradient-descent algorithms to be used to learn DTs. It also enables generalizing capabilities by learning regressors at the leaves simultaneously with the decisions in the tree. Prior approaches to making DTs differentiable rely either on probabilistic approximations at the tree's internal nodes (soft DTs) or on approximations in gradient computation at the internal node (quantized gradient descent). In this work, we propose DTSemNet, a…
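To make the distinction in the abstract concrete, the sketch below contrasts a hard oblique DT, where each internal node tests a linear combination of features and routes the input to exactly one child, with the soft-DT relaxation, where each test becomes a sigmoid routing probability. This is only an illustration of those two notions and is not the DTSemNet encoding itself; the tree shape, the weights in `WEIGHTS` and `BIASES`, and the function names are all hypothetical.

```python
import math

# Hypothetical depth-2 oblique tree: 3 internal nodes, 4 leaves.
# Internal node i sends x right if  w_i . x + b_i > 0, otherwise left.
# Node 0 is the root, node 1 its left child, node 2 its right child.
WEIGHTS = [(1.0, -1.0), (0.5, 1.0), (-1.0, 2.0)]
BIASES = [0.0, -1.0, 0.5]
LEAF_LABELS = ["A", "B", "C", "D"]


def node_score(i, x):
    """Linear (oblique) test value at internal node i."""
    return sum(wj * xj for wj, xj in zip(WEIGHTS[i], x)) + BIASES[i]


def hard_predict(x):
    """Hard oblique DT: every split is a strict threshold, one leaf is reached."""
    root_right = node_score(0, x) > 0
    child = 2 if root_right else 1
    child_right = node_score(child, x) > 0
    leaf = 2 * int(root_right) + int(child_right)
    return LEAF_LABELS[leaf]


def soft_leaf_probs(x):
    """Soft-DT relaxation: each split is a sigmoid probability of going right."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    p_root = sigmoid(node_score(0, x))
    p_left = sigmoid(node_score(1, x))
    p_right = sigmoid(node_score(2, x))
    # Probability of each leaf is the product of routing probabilities on its path.
    return [
        (1 - p_root) * (1 - p_left),
        (1 - p_root) * p_left,
        p_root * (1 - p_right),
        p_root * p_right,
    ]
```

The hard version is piecewise constant in the weights, which is why plain gradient descent cannot be applied to it directly; the soft version is differentiable but only approximates the hard routing, since every leaf receives some probability mass.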
