hi-RF: Incremental Learning Random Forest for large-scale multi-class Data Classification
Tingting Xie, Yuxing Peng, Changjian Wang

TL;DR
hi-RF introduces an incremental random forest method that adaptively updates or replaces trees to efficiently handle large-scale, multi-class data with evolving classes, balancing accuracy and computational cost.
Contribution
The paper presents a novel heterogeneous incremental learning method for random forests that adaptively updates or replaces trees to improve efficiency in large-scale, multi-class classification.
Findings
Achieves comparable accuracy with reduced computational time.
Effectively handles large-scale, multi-class data with evolving classes.
Balances precision and efficiency through novel out-of-bag estimation techniques.
Abstract
In recent years, dynamically growing data and an incrementally growing number of classes have posed new challenges for large-scale data classification research. Most traditional methods struggle to balance precision and computational burden as the data and the number of classes increase: some methods have weak precision, while others are time-consuming. In this paper, we propose an incremental learning method, namely heterogeneous incremental Nearest Class Mean Random Forest (hi-RF), to handle this issue. It is a heterogeneous method that adaptively either replaces trees or updates tree leaves in the random forest, reducing computational time at comparable performance when data of new classes arrive. Specifically, to maintain accuracy, a proportion of the trees are replaced by new NCM decision trees; to reduce the computational load, the remaining trees have their leaves updated…
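The replace-or-update split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes each "tree" is a single nearest-class-mean node over a random feature subset (the hypothetical class `NCMStump`), that the trees to replace are picked uniformly at random, and that the proportion is a free parameter (`replace_ratio`). The summary mentions out-of-bag estimation but does not give its details, so none is modeled here.

```python
import random


class NCMStump:
    """Toy stand-in for an NCM decision tree: one node, one centroid per class."""

    def __init__(self, n_features, rng):
        # The random feature subset is the only "structure" this toy tree has.
        k = max(1, n_features // 2)
        self.features = rng.sample(range(n_features), k)
        self.sums = {}    # class label -> per-feature running sum
        self.counts = {}  # class label -> number of samples seen

    def update(self, X, y):
        # Absorb samples into the running class means; the structure is unchanged.
        for x, label in zip(X, y):
            proj = [x[f] for f in self.features]
            if label not in self.sums:
                self.sums[label] = [0.0] * len(proj)
                self.counts[label] = 0
            self.sums[label] = [s + v for s, v in zip(self.sums[label], proj)]
            self.counts[label] += 1

    def predict(self, x):
        # Return the class whose centroid is nearest (squared Euclidean distance).
        proj = [x[f] for f in self.features]

        def dist2(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(proj, self.sums[label]))

        return min(self.counts, key=dist2)


def incremental_update(forest, X_old, y_old, X_new, y_new, replace_ratio, rng):
    """Replace a proportion of trees; update the statistics of the rest."""
    n_replace = int(round(replace_ratio * len(forest)))
    # Assumption: random selection of the trees to replace.
    replace_idx = set(rng.sample(range(len(forest)), n_replace))
    n_features = len(X_new[0])
    for i, tree in enumerate(forest):
        if i in replace_idx:
            # Costly path: rebuild from scratch on old and new data.
            fresh = NCMStump(n_features, rng)
            fresh.update(X_old + X_new, y_old + y_new)
            forest[i] = fresh
        else:
            # Cheap path: keep the structure, absorb only the new samples.
            tree.update(X_new, y_new)
    return replace_idx


def forest_predict(forest, x):
    # Majority vote over the per-tree predictions.
    votes = {}
    for tree in forest:
        label = tree.predict(x)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


rng = random.Random(0)
X_old = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]]
y_old = ["a", "a", "b", "b"]
forest = [NCMStump(2, rng) for _ in range(10)]
for t in forest:
    t.update(X_old, y_old)

# Data of a new class "c" arrives.
X_new = [[10.0, 10.0], [9.8, 10.1]]
y_new = ["c", "c"]
replaced = incremental_update(forest, X_old, y_old, X_new, y_new, 0.3, rng)
prediction = forest_predict(forest, [10.0, 10.0])
```

In this sketch the cost difference between the two paths is only that replaced trees reprocess all old data while updated trees touch just the new samples. In a real NCM forest the replaced trees would also relearn their split structure.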
Taxonomy
Topics: Machine Learning and ELM · Domain Adaptation and Few-Shot Learning · Data Stream Mining Techniques
