Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers
Abraham J. Wyner, Matthew Olson, Justin Bleich, David Mease

TL;DR
This paper proposes a unified explanation for AdaBoost and random forests, suggesting both work as interpolating classifiers that create a 'spikey-smooth' decision boundary, challenging traditional views on regularization and model complexity.
Contribution
It introduces a novel perspective that explains the success of both AdaBoost and random forests as self-averaging, interpolating algorithms rather than traditional optimization or regularization-based methods.
Findings
AdaBoost and random forests achieve similar predictive accuracy.
Random forests are self-averaging, interpolating classifiers.
The authors argue that regularization and early stopping are not necessary for boosting.
Abstract
There is a large literature explaining why AdaBoost is a successful classifier. The literature on AdaBoost focuses on classifier margins and boosting's interpretation as the optimization of an exponential likelihood function. These existing explanations, however, have been pointed out to be incomplete. A random forest is another popular ensemble method for which there is substantially less explanation in the literature. We introduce a novel perspective on AdaBoost and random forests that proposes that the two algorithms work for similar reasons. While both classifiers achieve similar predictive accuracy, random forests cannot be conceived as a direct optimization procedure. Rather, random forests is a self-averaging, interpolating algorithm which creates what we denote as a "spikey-smooth" classifier, and we view AdaBoost in the same light. We conjecture that both AdaBoost and random…
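The idea of a self-averaging, interpolating classifier can be illustrated with a small sketch. The code below is not the authors' experiment or a real random forest: it assumes a toy 10x10 integer grid with one deliberately flipped label, and it uses completely random trees (random feature, random threshold, no bootstrap) grown until every leaf is pure. The names `grow_tree`, `predict_tree`, and `vote_fraction` are hypothetical helpers introduced only for this illustration.

```python
import random


def grow_tree(points, labels, rng):
    """Grow a completely random tree until every leaf is pure.

    Assumes all points are distinct, so a pure split always exists.
    """
    if len(set(labels)) == 1:
        return labels[0]  # pure leaf: store its class
    while True:
        feat = rng.randrange(2)  # pick one of the two features at random
        values = sorted({p[feat] for p in points})
        if len(values) < 2:
            continue  # feature is constant in this node; try again
        i = rng.randrange(len(values) - 1)
        thresh = rng.uniform(values[i], values[i + 1])
        left = [k for k, p in enumerate(points) if p[feat] <= thresh]
        right = [k for k, p in enumerate(points) if p[feat] > thresh]
        if left and right:  # retry in the rare case a side is empty
            break
    return (
        feat,
        thresh,
        grow_tree([points[k] for k in left], [labels[k] for k in left], rng),
        grow_tree([points[k] for k in right], [labels[k] for k in right], rng),
    )


def predict_tree(tree, x):
    """Route x down the tree; internal nodes are tuples, leaves are labels."""
    while isinstance(tree, tuple):
        feat, thresh, left, right = tree
        tree = left if x[feat] <= thresh else right
    return tree


def vote_fraction(forest, x):
    """Fraction of trees voting for class 1 at x (the averaged prediction)."""
    return sum(predict_tree(t, x) for t in forest) / len(forest)


# Toy data (an assumption for illustration): class 1 when the first coordinate >= 5.
points = [(i, j) for i in range(10) for j in range(10)]
labels = [1 if i >= 5 else 0 for i, j in points]
labels[points.index((2, 2))] = 1  # inject one noisy label deep in class-0 territory

rng = random.Random(0)
forest = [grow_tree(points, labels, rng) for _ in range(50)]

# Every tree is pure, so the ensemble reproduces every training label, noise included.
interpolates = all(vote_fraction(forest, p) == float(y) for p, y in zip(points, labels))
print("interpolates training data:", interpolates)
print("vote for class 1 at noisy point (2, 2):", vote_fraction(forest, (2, 2)))
print("vote for class 1 near the noisy point (2.4, 2.4):", vote_fraction(forest, (2.4, 2.4)))
print("vote for class 1 far away (0.5, 8.5):", vote_fraction(forest, (0.5, 8.5)))
```

In this sketch the ensemble fits the flipped label exactly, yet a query far from it is still assigned class 0 by every tree. The vote near the noisy point depends on the random thresholds, which gives a rough picture of a fit that is "spikey" at the noise and smooth elsewhere.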
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Neural Networks and Applications
