An Eager Splitting Strategy for Online Decision Trees
Chaitanya Manapragada, Heitor M Gomes, Mahsa Salehi, Albert Bifet, Geoffrey I Webb

TL;DR
This paper studies the eager splitting strategy of the previously published Hoeffding AnyTime Tree (HATT) for online decision trees, showing that it outperforms the traditional Hoeffding Tree as a base learner in ensembles on UCI and synthetic streams.
Contribution
The paper evaluates HATT, an online decision tree method that converges to the ideal batch tree, as a replacement for Hoeffding Tree in ensemble settings, and shows that it outperforms Hoeffding Tree there.
Findings
HATT outperforms Hoeffding Tree as a base learner in ensembles on UCI and synthetic streams.
HATT converges to the ideal batch tree, unlike Hoeffding Tree.
HATT is a more effective base learner for online bagging and boosting.
Abstract
Decision tree ensembles are widely used in practice. In this work, we study in ensemble settings the effectiveness of replacing the split strategy for the state-of-the-art online tree learner, Hoeffding Tree, with a rigorous but more eager splitting strategy that we had previously published as Hoeffding AnyTime Tree. Hoeffding AnyTime Tree (HATT) uses the Hoeffding Test to determine whether the current best candidate split is superior to the current split, with the possibility of revision, while Hoeffding Tree aims to determine whether the top candidate is better than the second best and, if a test is selected, fixes it for all posterity. HATT converges to the ideal batch tree while Hoeffding Tree does not. We find that HATT is an efficacious base learner for online bagging and online boosting ensembles. On UCI and synthetic streams, HATT as a base learner outperforms HT within a 0.05…
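The difference between the two split tests described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `gains` dictionary of per-attribute split merits, and the parameter values are assumptions made for illustration, and details such as tie-breaking and the revision of existing splits are left out. The only formula used is the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)).

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound for the mean of n observations with range value_range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def ht_should_split(gains, value_range, delta, n):
    """Hoeffding Tree style test (sketch): best candidate vs. second-best candidate."""
    ranked = sorted(gains.values(), reverse=True)
    if len(ranked) < 2:
        return False
    return ranked[0] - ranked[1] > hoeffding_bound(value_range, delta, n)


def hatt_should_split(gains, current_gain, value_range, delta, n):
    """HATT style test (sketch): best candidate vs. the split currently in place.

    current_gain is assumed to be 0.0 at a leaf, where no split exists yet.
    """
    best = max(gains.values())
    return best - current_gain > hoeffding_bound(value_range, delta, n)


# Hypothetical merits for two candidate attributes after 1000 examples.
gains = {"attr_a": 0.30, "attr_b": 0.27}
ht_decision = ht_should_split(gains, value_range=1.0, delta=1e-7, n=1000)
hatt_decision = hatt_should_split(gains, current_gain=0.0, value_range=1.0, delta=1e-7, n=1000)
print(ht_decision, hatt_decision)
```

With these made-up numbers the two candidates are too close for the Hoeffding Tree test to separate them, while the best candidate already beats the absence of a split by more than the bound, so the HATT-style test splits earlier. In HATT the same comparison would be re-run at internal nodes so that a split can later be revised.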
Taxonomy
Topics: Data Stream Mining Techniques · Imbalanced Data Classification Techniques · Machine Learning and Data Classification
