TL;DR
This paper reviews hyperparameters of random forests, discusses tuning strategies including model-based optimization, and introduces the tuneRanger R package for automatic hyperparameter tuning, demonstrating improved performance and efficiency.
Contribution
It provides a comprehensive review of RF hyperparameters, applies the model-based optimization (MBO) tuning strategy, and introduces the tuneRanger R package for automatic hyperparameter optimization.
Findings
Tuning hyperparameters can improve RF performance.
tuneRanger outperforms other tuning methods in benchmarks.
Default hyperparameters work reasonably well in most cases, but tuning can still improve performance.
Abstract
The random forest algorithm (RF) has several hyperparameters that have to be set by the user, e.g., the number of observations drawn randomly for each tree and whether they are drawn with or without replacement, the number of variables drawn randomly for each split, the splitting rule, the minimum number of samples that a node must contain and the number of trees. In this paper, we first provide a literature review on the parameters' influence on the prediction performance and on variable importance measures. It is well known that in most cases RF works reasonably well with the default values of the hyperparameters specified in software packages. Nevertheless, tuning the hyperparameters can improve the performance of RF. In the second part of this paper, after a brief overview of tuning strategies we demonstrate the application of one of the most established tuning strategies,…
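To make the list of hyperparameters in the abstract concrete, the following is a minimal sketch of a search space and a plain random search over it, using only the Python standard library. It is not the tuneRanger API and it is not model-based optimization: random search is swapped in here as the simplest tuning strategy to illustrate. The names `RFConfig`, `sample_config`, `random_search`, and `toy_score`, the value ranges, and the splitting-rule labels are all assumptions made for illustration. In practice `toy_score` would be replaced by a real performance estimate, such as cross-validated or out-of-bag accuracy.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RFConfig:
    """One setting of the hyperparameters named in the abstract (hypothetical container)."""
    sample_fraction: float  # share of observations drawn for each tree
    replace: bool           # draw observations with or without replacement
    mtry: int               # number of variables drawn for each split
    splitrule: str          # splitting rule (labels here are illustrative)
    min_node_size: int      # minimum number of samples a node must contain
    num_trees: int          # number of trees in the forest


def sample_config(rng, n_features):
    """Draw one random configuration; the ranges are assumptions, not recommendations."""
    return RFConfig(
        sample_fraction=rng.uniform(0.2, 1.0),
        replace=rng.choice([True, False]),
        mtry=rng.randint(1, n_features),
        splitrule=rng.choice(["gini", "extratrees"]),
        min_node_size=rng.randint(1, 20),
        num_trees=rng.choice([100, 500, 1000]),
    )


def random_search(score, n_features, n_iter=20, seed=0):
    """Sample n_iter configurations and return the one with the highest score."""
    rng = random.Random(seed)
    candidates = [sample_config(rng, n_features) for _ in range(n_iter)]
    return max(candidates, key=score)


def toy_score(cfg):
    """Placeholder objective standing in for a real performance estimate."""
    return -abs(cfg.mtry - 3) - abs(cfg.sample_fraction - 0.7)


best = random_search(toy_score, n_features=10)
print(best)
```

Model-based optimization differs from this sketch in how candidates are chosen: rather than sampling every configuration independently, it fits a surrogate model to the scores observed so far and uses it to propose the next configuration to evaluate.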
