Estimating the Operating Characteristics of Ensemble Methods

Anthony Gamst; Jay-Calvin Reyes; Alden Walker

arXiv:1710.08952·stat.ML·October 26, 2017

TL;DR

This paper introduces a bootstrap-based technique for estimating the operating characteristics of certain ensemble methods, and the variability of those characteristics, in many cases without refitting a single model.

Contribution

The paper presents a bootstrap technique for evaluating certain types of ensemble methods and applies it to meta-parameter selection for random forests, examining alternatives to bootstrap aggregation and to considering \sqrt{d} features when splitting each node.

Findings

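01 The bootstrap can be used to estimate the operating characteristics of certain ensemble methods, together with their variability.

02 Alternatives to bootstrap aggregation and to considering \sqrt{d} features per split can improve the predictive accuracy of random forests.

03 In many cases the effect of infinite resampling can be determined without refitting a single model, which avoids the cost of bootstrapping on large training sets.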

Abstract

In this paper we present a technique for using the bootstrap to estimate the operating characteristics and their variability for certain types of ensemble methods. Bootstrapping a model can require a huge amount of work if the training data set is large. Fortunately, in many cases the technique lets us determine the effect of infinite resampling without actually refitting a single model. We apply the technique to the study of meta-parameter selection for random forests. We demonstrate that alternatives to bootstrap aggregation and to considering \sqrt{d} features to split each node, where d is the number of features, can produce improvements in predictive accuracy.
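
To make the idea of "infinite resampling without refitting" concrete, here is a minimal sketch of one way such a calculation can work. It is an illustration under stated assumptions, not the procedure from the paper: it assumes binary labels, an odd ensemble size, and a pool of already-fitted base models from which ensemble members are drawn independently with replacement. Under those assumptions, if a fraction p of the pool votes correctly on a test point, the chance that a majority of n drawn members is correct is a binomial tail probability, so the expectation over all resampled ensembles can be computed exactly instead of by repeated resampling. The function names and the example fractions are hypothetical.

```python
from math import comb


def majority_correct_prob(p: float, n: int) -> float:
    """Probability that a strict majority of n votes is correct when each
    vote is independently correct with probability p (assumes n is odd)."""
    if n % 2 == 0:
        raise ValueError("use an odd ensemble size to avoid ties")
    k_min = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1)
    )


def expected_ensemble_accuracy(correct_fractions, n: int) -> float:
    """Average, over test points, of the majority-vote success probability
    for an ensemble of n members drawn with replacement from the pool."""
    probs = [majority_correct_prob(p, n) for p in correct_fractions]
    return sum(probs) / len(probs)


if __name__ == "__main__":
    # Hypothetical data: for each test point, the fraction of models in the
    # fitted pool that predict the correct label.
    fractions = [0.9, 0.6, 0.4, 1.0]
    for n in (1, 11, 101):
        print(n, expected_ensemble_accuracy(fractions, n))
```

With n = 1 the result is just the mean of the per-point fractions; as n grows, points where the pool is right more than half the time are classified correctly with probability approaching one, and points where it is wrong more than half the time approach zero. No model is refitted at any step: only the recorded votes of the existing pool are used.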

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications