Mondrian Forests: Efficient Online Random Forests
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh

TL;DR
This paper introduces Mondrian forests, a new online random forest method that is computationally faster than existing online methods while maintaining comparable predictive accuracy, making it suitable for real-time applications.
Contribution
The paper proposes Mondrian forests, an online random forest algorithm built on Mondrian processes. It aims for accuracy close to that of batch methods while being significantly more computationally efficient than existing online random forests.
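To make the idea of a tree built from a Mondrian process more concrete, here is a minimal sketch of how a single tree can be sampled on a fixed batch of points, following the usual description of the construction. Each node draws a split time from an exponential distribution whose rate is the sum of the side lengths of the node's bounding box. If that time falls within the lifetime budget, a split dimension is chosen with probability proportional to its side length, and the split location is drawn uniformly within the data extent. The names `sample_mondrian_tree`, `leaf_sizes`, and `lifetime`, as well as the dictionary layout, are illustrative assumptions and not the authors' code. The online extension step and the prediction rule are not shown.

```python
import random


def sample_mondrian_tree(points, lifetime, rng, parent_time=0.0):
    """Sample one Mondrian tree restricted to the bounding box of `points`.

    Hypothetical helper: `points` is a non-empty list of equal-length tuples.
    """
    dims = range(len(points[0]))
    lower = [min(p[d] for p in points) for d in dims]
    upper = [max(p[d] for p in points) for d in dims]
    extents = [u - l for l, u in zip(lower, upper)]
    rate = sum(extents)  # sum of side lengths of the bounding box
    leaf = {"leaf": True, "lower": lower, "upper": upper, "n": len(points)}

    # A box with zero extent (e.g. a single point) can never be split.
    if rate == 0:
        return leaf

    # Split time = parent's time + Exp(rate); stop once the budget is exceeded.
    split_time = parent_time + rng.expovariate(rate)
    if split_time >= lifetime:
        return leaf

    # Dimension chosen proportionally to its extent, location uniform within it.
    dim = rng.choices(list(dims), weights=extents)[0]
    loc = rng.uniform(lower[dim], upper[dim])
    left = [p for p in points if p[dim] <= loc]
    right = [p for p in points if p[dim] > loc]
    if not right:  # guard for loc landing exactly on the upper bound
        return leaf

    return {
        "leaf": False, "time": split_time, "dim": dim, "loc": loc,
        "left": sample_mondrian_tree(left, lifetime, rng, split_time),
        "right": sample_mondrian_tree(right, lifetime, rng, split_time),
    }


def leaf_sizes(node):
    """Return the number of training points stored in each leaf."""
    if node["leaf"]:
        return [node["n"]]
    return leaf_sizes(node["left"]) + leaf_sizes(node["right"])


rng = random.Random(0)
data = [(rng.random(), rng.random()) for _ in range(50)]
tree = sample_mondrian_tree(data, lifetime=5.0, rng=rng)
```

A larger `lifetime` allows more splits and therefore deeper trees, while a lifetime of zero always yields a single leaf. A forest in this sketch would simply be several trees sampled independently from the same data.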
Findings
Mondrian forests achieve predictive performance competitive with existing online random forests.
They are more than an order of magnitude faster than comparable methods.
Online Mondrian forests follow the same distribution as batch Mondrian forests.
Abstract
Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as Breiman's random forest and extremely randomized trees) operate on batches of training data. Online methods are now in greater demand. Existing online random forests, however, require more training data than their batch counterpart to achieve comparable predictive performance. In this work, we use Mondrian processes (Roy and Teh, 2009) to construct ensembles of random decision trees we call Mondrian forests. Mondrian forests can be grown in an incremental/online fashion and remarkably, the…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Data Stream Mining Techniques
