Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles
Dhruv Mahajan, Vivek Gupta, S Sathiya Keerthi, Sundararajan Sellamanickam, Shravan Narayanamurthy, Rahul Kidambi

TL;DR
This paper introduces an efficient method to estimate the generalization error and bias-variance components of ensemble classifiers. It models how base classifier scores vary across random training subsets with a normal or beta distribution at each input point, and uses the fitted parameters to guide ensemble design.
Contribution
It proposes estimating generalization error by fitting a normal or beta distribution to the variation in base classifier scores, using only a small set of randomly chosen base classifiers, so that ensemble parameters can be tuned without training the full ensemble.
Findings
Accurate estimation of generalization error using small samples.
Effective guidance for choosing ensemble size and training data subset.
Potential for designing distributed ensemble classifiers.
Abstract
For many applications, an ensemble of base classifiers is an effective solution. The tuning of its parameters (number of classifiers, amount of data on which each classifier is to be trained, etc.) requires G, the generalization error of a given ensemble. The efficient estimation of G is the focus of this paper. The key idea is to approximate the variance of the class scores/probabilities of the base classifiers over the randomness imposed by the training subset by a normal or beta distribution at each point x in the input feature space. We estimate the parameters of the distribution using a small set of randomly chosen base classifiers and use those parameters to give efficient estimation schemes for G. We give empirical evidence for the quality of the various estimators. We also demonstrate their usefulness in making design choices such as the number of classifiers in the ensemble and the…
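The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's estimator, and it makes several assumptions that the abstract does not state: a binary problem, an ensemble that averages member scores and thresholds at 0.5, a method-of-moments beta fit at each point, and a normal approximation for the averaged score. The function names `fit_beta_moments` and `estimate_error` are hypothetical.

```python
import math
import random
from statistics import fmean, pvariance


def fit_beta_moments(scores):
    """Method-of-moments beta fit to scores in (0, 1); returns (a, b)."""
    mu = min(max(fmean(scores), 1e-6), 1.0 - 1e-6)
    var = pvariance(scores)
    # A beta distribution needs 0 < var < mu * (1 - mu); clip to stay valid.
    var = min(max(var, 1e-12), mu * (1.0 - mu) * (1.0 - 1e-9))
    common = mu * (1.0 - mu) / var - 1.0
    return mu * common, (1.0 - mu) * common


def estimate_error(scores_per_point, labels, ensemble_size):
    """Estimated error of a score-averaging ensemble with `ensemble_size` members.

    scores_per_point[i] holds class-1 scores at point i from a few sampled
    base classifiers; labels[i] is the true label (0 or 1).
    """
    total = 0.0
    for scores, y in zip(scores_per_point, labels):
        a, b = fit_beta_moments(scores)
        mean = a / (a + b)
        var = a * b / ((a + b) ** 2 * (a + b + 1.0))
        # Assumption: the average of K independent scores is roughly normal
        # with the same mean and variance var / K.
        z = (0.5 - mean) / math.sqrt(var / ensemble_size)
        p_predict_one = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))
        total += (1.0 - p_predict_one) if y == 1 else p_predict_one
    return total / len(labels)


if __name__ == "__main__":
    # Simulated scores from 8 sampled base classifiers at 200 points.
    rng = random.Random(0)
    labels = [rng.randint(0, 1) for _ in range(200)]
    scores = [
        [rng.betavariate(6, 4) if y == 1 else rng.betavariate(4, 6) for _ in range(8)]
        for y in labels
    ]
    for k in (1, 5, 25, 100):
        print(k, round(estimate_error(scores, labels, k), 4))
```

Because the per-point parameters are fitted once, the estimate can be recomputed for any candidate ensemble size without training more classifiers, which is the kind of design question the abstract mentions.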
Taxonomy
Topics: Face and Expression Recognition · Advanced Statistical Methods and Models · Neural Networks and Applications
