Block-regularized 5×2 Cross-validated McNemar's Test for Comparing Two Classification Algorithms
Jing Yang, Ruibo Wang, Yijun Song, and Jihong Li

TL;DR
This paper introduces a new 5×2 cross-validated McNemar's test using block regularization to improve the power and stability of comparing two classification algorithms, validated on simulated and real data.
Contribution
It proposes a novel 5×2 BCV McNemar's test that compresses contingency tables for better error rate estimation in algorithm comparison.
Findings
Demonstrates reasonable type I error control.
Shows improved test power on multiple datasets.
Validates effectiveness through simulations and real-world data.
Abstract
In the task of comparing two classification algorithms, the widely used McNemar's test aims to infer the presence of a significant difference between the error rates of the two classification algorithms. However, the power of the conventional McNemar's test is usually unpromising because the hold-out (HO) method in the test uses only a single train-validation split, which usually produces a highly variable estimate of the error rates. In contrast, a cross-validation (CV) method repeats the HO method multiple times and produces a stable estimate. Therefore, a CV method has a great advantage in improving the power of McNemar's test. Among all types of CV methods, a block-regularized 5×2 CV (BCV) has been shown in many previous studies to be superior to the other CV methods in the algorithm comparison task because the 5×2 BCV can produce a high-quality estimator of the…
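For orientation, the conventional hold-out McNemar's test that the abstract takes as its baseline can be sketched as below. This is a minimal illustration of the standard continuity-corrected statistic on a single validation set, not the paper's 5×2 BCV procedure. The function name `mcnemar_test` and its argument names are hypothetical, chosen only for this example.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Conventional McNemar's test on one hold-out validation set.

    Illustrative sketch only: this is the standard single-split test,
    not the block-regularized 5x2 cross-validated variant.
    Returns (chi-squared statistic, p-value).
    """
    # Discordant counts: examples where exactly one algorithm is wrong.
    n01 = sum(1 for y, a, b in zip(y_true, pred_a, pred_b) if a != y and b == y)
    n10 = sum(1 for y, a, b in zip(y_true, pred_a, pred_b) if a == y and b != y)

    if n01 + n10 == 0:
        # The two algorithms never disagree in correctness: no evidence of a difference.
        return 0.0, 1.0

    # Chi-squared statistic with continuity correction, 1 degree of freedom.
    stat = (abs(n01 - n10) - 1) ** 2 / (n01 + n10)

    # Survival function of chi-squared(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy data: algorithm A alone errs on 10 examples, algorithm B alone on 2.
y_true = [1] * 20
pred_a = [0] * 10 + [1] * 10
pred_b = [1] * 10 + [0] * 2 + [1] * 8
stat, p_value = mcnemar_test(y_true, pred_a, pred_b)
print(f"statistic={stat:.4f}, p-value={p_value:.4f}")
```

Because only the discordant counts from one split enter the statistic, a different train-validation split can change the outcome substantially, which is the instability the abstract attributes to the hold-out method.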
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Multi-Criteria Decision Making
Methods: Test
