Small Sample Inference for Generalization Error in Classification Using the CUD Bound

Eric B. Laber; Susan A. Murphy

arXiv:1206.3274 · cs.LG · June 18, 2012

TL;DR

This paper introduces a method for constructing confidence sets for the generalization error of classifiers trained on small samples. The method combines a smooth upper bound with the bootstrap and outperforms traditional resampling methods.

Contribution

It proposes a confidence set construction based on a smooth upper bound and the bootstrap, addressing the non-normality of the estimated generalization error in small samples.

Findings

01. Outperforms traditional resampling methods in small-sample scenarios

02. Provides a computationally efficient algorithm for parametric additive models

03. Demonstrates superior performance on test and simulated datasets

Abstract

Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization error by resampling and then assume the resampled estimator follows a known distribution to form a confidence set [Kohavi 1995, Martin 1996, Yang 2006]. Alternatively, one might bootstrap the resampled estimator of the generalization error to form a confidence set. Unfortunately, these methods do not reliably provide sets of the desired confidence. The poor performance appears to be due to the lack of smoothness of the generalization error as a function of the learned classifier. This results in a non-normal distribution of the estimated generalization error. We construct a confidence set for the generalization error by use of a smooth upper bound on the deviation between the resampled estimate and generalization…
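
The two baseline approaches the abstract describes (a resampled error estimate with an assumed distribution, and a bootstrap of that resampled estimate) can be sketched as follows. This is a minimal illustration, not the paper's CUD-bound construction: the least-squares linear classifier, the synthetic data, the 5-fold cross-validation, the binomial-style normal interval, and all function names are assumptions chosen for the example.

```python
import numpy as np

def fit_linear(X, y):
    # Assumed stand-in classifier: least-squares fit to labels in {-1, +1}.
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return beta

def predict(beta, X):
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.where(Xb @ beta >= 0, 1.0, -1.0)

def cv_error(X, y, k=5, rng=None):
    # Resampled (k-fold cross-validation) estimate of the misclassification rate.
    rng = np.random.default_rng() if rng is None else rng
    n = len(y)
    folds = np.array_split(rng.permutation(n), k)
    mistakes = 0
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        beta = fit_linear(X[train_mask], y[train_mask])
        mistakes += np.sum(predict(beta, X[test_idx]) != y[test_idx])
    return float(mistakes / n)

def normal_interval(err, n, z=1.96):
    # Baseline 1: treat the resampled estimate as approximately normal
    # with a binomial-style standard error (an assumed, common choice).
    half = z * np.sqrt(err * (1.0 - err) / n)
    return float(max(0.0, err - half)), float(min(1.0, err + half))

def bootstrap_interval(X, y, n_boot=200, alpha=0.05, k=5, seed=0):
    # Baseline 2: percentile bootstrap of the cross-validated estimate.
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        stats[b] = cv_error(X[idx], y[idx], k=k, rng=rng)
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Small synthetic training sample (hypothetical data, n = 30).
data_rng = np.random.default_rng(0)
n = 30
X = data_rng.normal(size=(n, 2))
y = np.where(X[:, 0] + 0.5 * data_rng.normal(size=n) > 0, 1.0, -1.0)

err = cv_error(X, y, k=5, rng=np.random.default_rng(1))
normal_ci = normal_interval(err, n)
boot_ci = bootstrap_interval(X, y)
print("CV error:", err)
print("normal-approximation interval:", normal_ci)
print("bootstrap percentile interval:", boot_ci)
```

Both intervals here are the kind the abstract reports as unreliable at small sample sizes; the sketch only makes the baselines concrete and says nothing about their actual coverage.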

Taxonomy

Topics: Machine Learning and Algorithms · Neural Networks and Applications · Machine Learning and Data Classification