Alpha-Trimming: Locally Adaptive Tree Pruning for Random Forests
Nikola Surjanovic, Andrew Henrey, Thomas M. Loughin

TL;DR
This paper introduces alpha-trimming, an adaptive pruning method for the regression trees in a random forest. It prunes more aggressively in regions with a low signal-to-noise ratio to improve predictive accuracy, and its tuning parameter can be adjusted without refitting the trees.
Contribution
It proposes a fast, locally adaptive pruning algorithm for random forests that improves performance and allows tuning without refitting trees.
Findings
Significant reduction in mean squared prediction error across a benchmark suite of 46 data sets.
Pruning adapts to local data characteristics, improving overall model accuracy.
The method maintains or improves performance compared to fully grown trees.
Abstract
We demonstrate that adaptively controlling the size of individual regression trees in a random forest can improve predictive performance, contrary to the conventional wisdom that trees should be fully grown. A fast pruning algorithm, alpha-trimming, is proposed as an effective approach to pruning trees within a random forest, where more aggressive pruning is performed in regions with a low signal-to-noise ratio. The amount of overall pruning is controlled by adjusting the weight on an information criterion penalty as a tuning parameter, with the standard random forest being a special case of our alpha-trimmed random forest. A remarkable feature of alpha-trimming is that its tuning parameter can be adjusted without refitting the trees in the random forest after the trees have been fully grown once. In a benchmark suite of 46 example data sets, mean squared prediction error is often…
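The idea described in the abstract, pruning driven by a weighted penalty that can be retuned without refitting, can be illustrated with a minimal sketch. This is not the paper's algorithm: the exact information criterion is not given in this summary, so the sketch assumes a simple penalized sum of squared errors, where each leaf pays `alpha * penalty`. The names `Node`, `trimmed_cost`, and `leaves_after_trimming` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """One node of a fully grown regression tree (hypothetical structure).

    The statistics are stored once, when the tree is grown, so that
    different alpha values can be tried later without refitting.
    """
    n: int                      # number of training points in the node
    sse: float                  # sum of squared errors around the node mean
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def trimmed_cost(node: Node, alpha: float, penalty: float) -> Tuple[float, int]:
    """Return (penalized cost, leaf count) of the best trimmed subtree.

    ASSUMPTION: cost of a leaf = sse + alpha * penalty. The paper's actual
    criterion may differ; this only shows the bottom-up comparison.
    """
    leaf_cost = node.sse + alpha * penalty
    if node.left is None or node.right is None:
        return leaf_cost, 1
    left_cost, left_leaves = trimmed_cost(node.left, alpha, penalty)
    right_cost, right_leaves = trimmed_cost(node.right, alpha, penalty)
    subtree_cost = left_cost + right_cost
    # Collapse only when the single leaf is strictly cheaper, so that
    # alpha = 0 always keeps the fully grown tree.
    if leaf_cost < subtree_cost:
        return leaf_cost, 1
    return subtree_cost, left_leaves + right_leaves


def leaves_after_trimming(root: Node, alpha: float, penalty: float = 1.0) -> int:
    """Leaf count of the trimmed tree; the stored tree is never modified."""
    return trimmed_cost(root, alpha, penalty)[1]


# Toy tree: the root split removes a lot of error (strong signal), while
# the split under the right child removes very little (mostly noise).
toy_tree = Node(
    n=8, sse=10.0,
    left=Node(n=4, sse=1.0),
    right=Node(n=4, sse=4.0,
               left=Node(n=2, sse=1.9),
               right=Node(n=2, sse=2.0)),
)

for alpha in (0.0, 0.5, 100.0):
    print(alpha, leaves_after_trimming(toy_tree, alpha))
```

In this toy tree, a moderate `alpha` collapses only the weak split under the right child, which is the locally adaptive behaviour the abstract describes, while `alpha = 0` leaves the fully grown tree untouched. Because the function only reads stored node statistics, the same grown tree can be evaluated at many `alpha` values.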
Taxonomy
Topics: Data Management and Algorithms · Data Mining Algorithms and Applications · Machine Learning and Data Classification
Methods: Pruning
