On the Current State of Research in Explaining Ensemble Performance Using Margins
Waldyn Martinez, J. Brian Gray

TL;DR
This paper reviews and empirically tests current theories on how margins influence ensemble classifier performance, evaluating whether larger margins correlate with better generalization across various datasets.
Contribution
It introduces new techniques to analyze margin-based explanations and empirically assesses their validity using experiments on real and simulated data.
Findings
Larger margins tend to correlate with lower generalization error.
Increasing mean and decreasing variance of margins can improve ensemble performance.
Empirical results support some theoretical bounds but also highlight limitations.
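The margin statistics these findings refer to can be made concrete with a small sketch. It is an illustration only, not the authors' code: the function name `voting_margins`, the vote-matrix layout, and the optional `weights` argument are assumptions introduced here. The margin of a training example is taken to be the (weighted) share of votes for its true class minus the largest share received by any other class, so it lies in [-1, 1] and is positive exactly when the ensemble classifies the example correctly.

```python
import numpy as np

def voting_margins(votes, y, n_classes, weights=None):
    """Margins of a voting ensemble (hypothetical helper, for illustration).

    votes     : (n_samples, n_classifiers) integer array of predicted labels
    y         : (n_samples,) integer array of true labels in 0..n_classes-1
    n_classes : number of classes (assumed >= 2)
    weights   : optional non-negative classifier weights; uniform if None
    """
    votes = np.asarray(votes)
    y = np.asarray(y)
    n_samples, n_classifiers = votes.shape
    w = np.ones(n_classifiers) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so vote shares sum to 1 per example

    # Weighted share of votes that each class receives for each example.
    share = np.zeros((n_samples, n_classes))
    for k in range(n_classes):
        share[:, k] = (votes == k) @ w

    rows = np.arange(n_samples)
    correct = share[rows, y]

    # Largest share among the wrong classes: mask out the true class first.
    wrong = share.copy()
    wrong[rows, y] = -np.inf
    best_wrong = wrong.max(axis=1)

    return correct - best_wrong

# Toy example: 3 examples, 3 base classifiers, 3 classes.
votes = np.array([[0, 0, 1],
                  [1, 1, 1],
                  [0, 1, 2]])
y = np.array([0, 1, 2])
margins = voting_margins(votes, y, n_classes=3)
mean_margin = margins.mean()
var_margin = margins.var()
```

The mean and variance computed at the end are the two summary statistics mentioned above; comparing them across ensembles on the same training set is one simple way to explore whether a higher mean and lower variance accompany lower test error.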
Abstract
Empirical evidence shows that ensembles, such as bagging, boosting, random and rotation forests, generally perform better in terms of their generalization error than individual classifiers. To explain this performance, Schapire et al. (1998) developed an upper bound on the generalization error of an ensemble based on the margins of the training data, from which it was concluded that larger margins should lead to lower generalization error, everything else being equal. Many other researchers have backed this assumption and presented tighter bounds on the generalization error based on either the margins or functions of the margins. For instance, Shen and Li (2010) provide evidence suggesting that the generalization error of a voting classifier might be reduced by increasing the mean and decreasing the variance of the margins. In this article we propose several techniques and empirically…
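For readers unfamiliar with the quantities involved, the following sketch restates the margin and the general shape of the Schapire et al. (1998) bound as they are commonly presented. The notation is an assumption made here, not taken from this summary: base classifiers h_t with non-negative weights alpha_t summing to one, a training sample S of size m drawn from a distribution D, a finite base-classifier class H, and a confidence parameter delta.

```latex
% Margin of a labeled example (x, y) under a weighted voting classifier
\operatorname{margin}(x, y)
  = \sum_{t} \alpha_t \,\mathbf{1}[h_t(x) = y]
  - \max_{y' \neq y} \sum_{t} \alpha_t \,\mathbf{1}[h_t(x) = y']

% Binary case, y \in \{-1, +1\}, f(x) = \sum_t \alpha_t h_t(x):
% the margin is y f(x). With probability at least 1 - \delta over S,
% for every \theta > 0,
P_{D}\bigl[\, y f(x) \le 0 \,\bigr]
  \;\le\;
P_{S}\bigl[\, y f(x) \le \theta \,\bigr]
  + O\!\left( \frac{1}{\sqrt{m}}
      \left( \frac{\log m \,\log |H|}{\theta^{2}}
             + \log \frac{1}{\delta} \right)^{1/2} \right)
```

Read this way, the first term on the right shrinks when few training examples have margin below theta, and the second term shrinks as theta grows, which is the reasoning behind the claim that larger margins should lead to lower generalization error when everything else is equal.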
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Neural Networks and Applications
