Bridging Breiman's Brook: From Algorithmic Modeling to Statistical Learning
Lucas Mentch, Giles Hooker

TL;DR
This paper argues that the "data modeling" and "algorithmic modeling" cultures have largely merged, highlights recent advances in random forest theory that exemplify this integration, and explores future research directions.
Contribution
It reviews recent developments in random forest theory that bridge the gap between algorithmic and statistical modeling approaches, emphasizing the importance of statistical analysis.
Findings
Recent advances in the statistical understanding of random forests integrate algorithmic and statistical perspectives.
The blending of modeling approaches exposes limitations of prediction-first philosophies.
Statistical analysis remains crucial despite algorithmic successes.
Abstract
In 2001, Leo Breiman wrote of a divide between "data modeling" and "algorithmic modeling" cultures. Twenty years later this division feels far more ephemeral, both in terms of assigning individuals to camps, and in terms of intellectual boundaries. We argue that this is largely due to the "data modelers" incorporating algorithmic methods into their toolbox, particularly driven by recent developments in the statistical understanding of Breiman's own Random Forest methods. While this can be simplistically described as "Breiman won", these same developments also expose the limitations of the prediction-first philosophy that he espoused, making careful statistical analysis all the more important. This paper outlines these exciting recent developments in the random forest literature which, in our view, occurred as a result of a necessary blending of the two ways of thinking Breiman…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Data Mining Algorithms and Applications
