TL;DR
This paper introduces BAGofT, a new goodness-of-fit test for classification methods that does not rely on parametric assumptions, effectively detecting underfitting in both parametric and non-parametric models.
Contribution
The paper proposes BAGofT, a novel data-splitting methodology for assessing the fit of any classification procedure, addressing a gap in existing goodness-of-fit testing methods.
Findings
BAGofT effectively detects underfitting in classification models.
The test controls its size (type I error rate), and its power grows as the sample size increases.
Simulation studies demonstrate its advantages over existing methods.
Abstract
In recent years, many non-traditional classification methods, such as Random Forest, Boosting, and neural networks, have been widely used in applications. Their performance is typically measured in terms of classification accuracy. While the classification error rate and the like are important, they do not address a fundamental question: Is the classification method underfitted? To the best of our knowledge, there is no existing method that can assess the goodness-of-fit of a general classification procedure. Indeed, the lack of a parametric assumption makes it challenging to construct proper tests. To overcome this difficulty, we propose a methodology called BAGofT that splits the data into a training set and a validation set. First, the classification procedure under assessment is applied to the training set, which is also used to adaptively find a data grouping that reveals the most severe regions…
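The data-splitting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract is truncated before the validation step, so everything past the split is an assumption: the grouping rule (quantile bins of whichever single covariate shows the largest training-set discrepancy) and the statistic (a Hosmer-Lemeshow-style sum of squared standardized group residuals, referred to a chi-square distribution) are stand-ins chosen for brevity. The names `split_gof_test`, `grouped_stat`, and `fit_constant` are hypothetical.

```python
import numpy as np
from scipy.stats import chi2


def grouped_stat(resid, var, groups, n_groups):
    """Sum over non-empty groups of (sum of residuals)^2 / (sum of variances)."""
    stat, used = 0.0, 0
    for k in range(n_groups):
        mask = groups == k
        if mask.any():
            stat += resid[mask].sum() ** 2 / var[mask].sum()
            used += 1
    return stat, used


def split_gof_test(fit, X, y, n_groups=5, train_frac=0.5, seed=0):
    """Data-splitting goodness-of-fit sketch for a binary classifier.

    fit(X_train, y_train) must return a function that maps a feature
    matrix to predicted P(y = 1).
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_train = int(train_frac * len(y))
    tr, va = idx[:n_train], idx[n_train:]

    # Fit the procedure under assessment on the training set only.
    predict = fit(X[tr], y[tr])

    # Assumption: pick the grouping on the training set by choosing the
    # covariate whose quantile bins show the largest grouped discrepancy.
    p_tr = np.clip(predict(X[tr]), 1e-8, 1 - 1e-8)
    qs = np.linspace(0, 1, n_groups + 1)[1:-1]
    best = None
    for j in range(X.shape[1]):
        edges = np.quantile(X[tr, j], qs)
        g_tr = np.searchsorted(edges, X[tr, j])
        s, _ = grouped_stat(y[tr] - p_tr, p_tr * (1 - p_tr), g_tr, n_groups)
        if best is None or s > best[0]:
            best = (s, j, edges)
    _, j, edges = best

    # Assumption: on the held-out validation set, compare observed and
    # predicted outcomes within the groups chosen above.
    p_va = np.clip(predict(X[va]), 1e-8, 1 - 1e-8)
    g_va = np.searchsorted(edges, X[va, j])
    stat, used = grouped_stat(y[va] - p_va, p_va * (1 - p_va), g_va, n_groups)
    return stat, chi2.sf(stat, df=used)


def fit_constant(X_train, y_train):
    """A deliberately underfitted 'classifier': it ignores the features."""
    rate = y_train.mean()
    return lambda X_new: np.full(len(X_new), rate)


# Simulated data in which the first feature strongly drives the label.
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 2))
y = (rng.random(2000) < 1 / (1 + np.exp(-2 * X[:, 0]))).astype(float)

stat, p_value = split_gof_test(fit_constant, X, y)
print(f"statistic = {stat:.1f}, p-value = {p_value:.3g}")
```

Because the grouping is chosen from the training set alone, the validation-set statistic is not contaminated by the search over groupings, which is the reason for splitting the data in this sketch. The feature-ignoring classifier should be rejected here, since its predicted probabilities miss the dependence on the first feature.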
Taxonomy
Methods, Test
