Fully-Dynamic Decision Trees
Marco Bressan, Gabriel Damay, Mauro Sozio

TL;DR
This paper introduces the first fully dynamic decision tree algorithm that efficiently maintains near-optimal splits over arbitrary insertions and deletions, with strong theoretical guarantees and practical effectiveness.
Contribution
It presents a novel fully dynamic decision tree algorithm with provable guarantees on split quality, running time, and space, improving upon previous static or semi-dynamic methods.
Findings
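At every point in time, every node of the tree uses a split whose Gini gain is within an additive epsilon of the optimum.
The amortized running time per insertion/deletion is bounded for real-valued features, and the bound improves for binary or categorical features.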
Experimental results demonstrate the algorithm's practical effectiveness on real-world data.
Abstract
We develop the first fully dynamic algorithm that maintains a decision tree over an arbitrary sequence of insertions and deletions of labeled examples. Given $\epsilon > 0$ our algorithm guarantees that, at every point in time, every node of the decision tree uses a split with Gini gain within an additive $\epsilon$ of the optimum. For real-valued features the algorithm has an amortized running time per insertion/deletion of $O\big(\frac{d \log^3 n}{\epsilon^2}\big)$, which improves to $O\big(\frac{d \log^2 n}{\epsilon}\big)$ for binary or categorical features, while it uses space $O(nd)$, where $n$ is the maximum number of examples at any point in time and $d$ is the number of features. Our algorithm is nearly optimal, as we show that any algorithm with similar guarantees uses amortized running time $\Omega(d)$ and space $\tilde{\Omega}(nd)$. We complement our theoretical results…
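To make the guarantee in the abstract concrete, here is a minimal Python sketch of what "a split with Gini gain within an additive tolerance of the optimum" means for a single node with binary features. It is a naive baseline, not the paper's algorithm: it assumes the standard definition of Gini gain (parent impurity minus size-weighted child impurity), keeps exact label counts for one node only, and recomputes all gains on demand. The names `gini`, `gini_gain`, `BinaryFeatureCounts`, and `is_eps_optimal` are hypothetical and chosen for illustration.

```python
from collections import Counter


def gini(counts):
    """Gini impurity 1 - sum_k p_k^2 of a label multiset given as a Counter."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_gain(left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    parent = left + right  # Counter addition keeps only positive counts
    n = sum(parent.values())
    if n == 0:
        return 0.0
    n_left = sum(left.values())
    n_right = n - n_left
    return gini(parent) - (n_left / n) * gini(left) - (n_right / n) * gini(right)


class BinaryFeatureCounts:
    """Illustrative only: exact label counts per value of each binary feature."""

    def __init__(self, d):
        # counts[j][v] is a Counter of labels among examples with x[j] == v
        self.counts = [(Counter(), Counter()) for _ in range(d)]

    def insert(self, x, y):
        for j, v in enumerate(x):
            self.counts[j][v][y] += 1

    def delete(self, x, y):
        # Assumes (x, y) was inserted earlier and not yet deleted.
        for j, v in enumerate(x):
            self.counts[j][v][y] -= 1

    def gains(self):
        """Gini gain of splitting on each feature, recomputed from scratch."""
        return [gini_gain(c0, c1) for c0, c1 in self.counts]


def is_eps_optimal(gains, chosen, eps):
    """True if the chosen feature's gain is within an additive eps of the best."""
    return gains[chosen] >= max(gains) - eps


node = BinaryFeatureCounts(d=2)
for x, y in [((0, 1), 0), ((0, 0), 0), ((1, 1), 1), ((1, 0), 1)]:
    node.insert(x, y)
print(node.gains())
node.delete((1, 0), 1)
print(node.gains())
```

In this sketch a node would only need to change its split when `is_eps_optimal` fails for the feature it currently uses. Handling real-valued thresholds, whole trees, and the amortized bounds stated in the abstract requires the machinery of the paper itself, which this example does not attempt to reproduce.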
Taxonomy
Topics: Data Mining Algorithms and Applications · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
