Making Sense of Random Forest Probabilities: a Kernel Perspective

Matthew A. Olson; Abraham J. Wyner

arXiv:1812.05792 · stat.ML · December 17, 2018 · 22 cites

Open Access

TL;DR

This paper links random forest probability estimation to kernel regression, which puts the estimates on sounder statistical footing and yields guidance for tuning a forest to improve probability accuracy.

Contribution

It establishes a kernel perspective on random forest probabilities, connecting them to kernel regression and guiding better tuning practices.

Findings

01

Random forests can be interpreted through a proximity kernel lens.

02

The kernel perspective clarifies the geometry and sparsity in probability estimation.

03

The paper offers intuition and recommendations for tuning a random forest to improve its probability estimates.

Abstract

A random forest is a popular tool for estimating probabilities in machine learning classification tasks. However, the means by which this is accomplished is unprincipled: one simply counts the fraction of trees in a forest that vote for a certain class. In this paper, we forge a connection between random forests and kernel regression. This places random forest probability estimation on more sound statistical footing. As part of our investigation, we develop a model for the proximity kernel and relate it to the geometry and sparsity of the estimation problem. We also provide intuition and recommendations for tuning a random forest to improve its probability estimates.
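The two quantities the abstract contrasts can be illustrated with a small sketch. The code below is not the authors' construction: it assumes a toy forest of randomized one-dimensional trees (no bootstrap, uniformly random split points), and the helper names `grow`, `leaf_of`, `vote_fraction`, `proximity`, and `kernel_estimate` are hypothetical. It computes the vote-counting estimate alongside a kernel regression estimate that weights each training label by the proximity kernel, taken here as the fraction of trees in which a query point and a training point share a leaf.

```python
import random

def grow(points, min_leaf, rng):
    """Grow one randomized 1-D tree; points are (x, y, index) triples."""
    xs = [p[0] for p in points]
    lo, hi = min(xs), max(xs)
    if len(points) <= min_leaf or lo == hi:
        return {"leaf": points}
    thr = rng.uniform(lo, hi)  # assumption: split point chosen uniformly at random
    left = [p for p in points if p[0] <= thr]
    right = [p for p in points if p[0] > thr]
    if not left or not right:
        return {"leaf": points}
    return {"thr": thr,
            "left": grow(left, min_leaf, rng),
            "right": grow(right, min_leaf, rng)}

def leaf_of(tree, x):
    """Return the training points that share a leaf with x."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["thr"] else tree["right"]
    return tree["leaf"]

def vote_fraction(forest, x):
    """Fraction of trees whose leaf majority at x is class 1."""
    votes = 0
    for tree in forest:
        leaf = leaf_of(tree, x)
        ones = sum(y for _, y, _ in leaf)
        votes += 1 if 2 * ones > len(leaf) else 0
    return votes / len(forest)

def proximity(forest, x, n):
    """K(x, x_i): fraction of trees in which x and training point i share a leaf."""
    k = [0.0] * n
    for tree in forest:
        for _, _, i in leaf_of(tree, x):
            k[i] += 1.0 / len(forest)
    return k

def kernel_estimate(forest, x, ys):
    """Kernel regression estimate of P(y = 1 | x) using the proximity kernel."""
    k = proximity(forest, x, len(ys))
    return sum(ki * yi for ki, yi in zip(k, ys)) / sum(k)

# Synthetic data (an assumption for illustration): P(y = 1 | x) = x on [0, 1].
rng = random.Random(0)
xs = [rng.random() for _ in range(200)]
ys = [1 if rng.random() < x else 0 for x in xs]
data = [(x, y, i) for i, (x, y) in enumerate(zip(xs, ys))]
forest = [grow(data, 5, rng) for _ in range(100)]

for x in (0.1, 0.5, 0.9):
    print(x, vote_fraction(forest, x), kernel_estimate(forest, x, ys))
```

In this sketch the minimum leaf size (`min_leaf`) acts somewhat like a kernel bandwidth: larger leaves spread the proximity weights over more training points. That is one way to read the abstract's remark that tuning affects the quality of the probability estimates, though the paper's own recommendations should be taken from the paper itself.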

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Face and Expression Recognition