Better-Than-Chance Classification for Signal Detection

Jonathan D. Rosenblatt; Yuval Benjamini; Roee Gilron; Roy Mukamel; Jelle J. Goeman

arXiv:1608.08873·stat.ME·January 28, 2020

TL;DR

This paper evaluates the use of classifier accuracy as a test statistic for signal detection, shows that it is often underpowered compared with multivariate tests, and proposes alternatives with better detection power.

Contribution

The paper demonstrates that accuracy-based tests are less powerful than multivariate tests for detecting differences between two populations, and suggests improvements such as the Leave-One-Out Bootstrap for classifier evaluation.

Findings

01

Accuracy-based tests have lower detection power than multivariate tests.

02

The discrete nature of the accuracy statistic and regularization both reduce the power of accuracy tests.

03

Leave-One-Out Bootstrap improves classifier evaluation power.
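
To make the Leave-One-Out Bootstrap finding concrete, here is a minimal sketch of one common form of the estimator: the classifier is trained on each bootstrap resample and scored only on the observations left out of that resample. It is an illustration under stated assumptions, not the paper's implementation. The nearest-centroid classifier is a stand-in chosen to keep the example dependency-free, and the names `nearest_centroid_predict` and `loo_bootstrap_accuracy` are hypothetical.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test row to the class (0 or 1) with the closer training mean."""
    mu0 = X_train[y_train == 0].mean(axis=0)
    mu1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - mu0) ** 2).sum(axis=1)
    d1 = ((X_test - mu1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def loo_bootstrap_accuracy(X, y, n_boot=200, seed=0):
    """Leave-one-out bootstrap accuracy (assumed form, labels must be 0/1).

    Each observation is scored only by classifiers whose bootstrap
    training sample did not contain it.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    correct = np.zeros(n)  # correct out-of-sample predictions per observation
    counts = np.zeros(n)   # number of resamples that left the observation out
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample with replacement
        y_b = y[idx]
        if y_b.min() == y_b.max():            # skip resamples missing a class
            continue
        out = np.setdiff1d(np.arange(n), idx) # observations not in the resample
        if out.size == 0:
            continue
        pred = nearest_centroid_predict(X[idx], y_b, X[out])
        correct[out] += (pred == y[out])
        counts[out] += 1
    seen = counts > 0
    # Average per-observation accuracy over observations left out at least once
    return float(np.mean(correct[seen] / counts[seen]))
```

The resulting estimate can serve as the test statistic inside a permutation test in place of a V-fold cross-validated accuracy. Averaging over many resamples yields a less coarse statistic than a single cross-validation split, which is one plausible reason for the power gain the finding describes.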

Abstract

The estimated accuracy of a classifier is a random quantity with variability. A common practice in supervised machine learning is thus to test whether the estimated accuracy is significantly better than chance level. This method of signal detection is particularly popular in neuroimaging and genetics. We provide evidence that using a classifier's accuracy as a test statistic can be an underpowered strategy for finding differences between populations, compared to a bona fide statistical test. It is also computationally more demanding than a statistical test. Via simulation, we compare test statistics that are based on classification accuracy to others based on multivariate test statistics. We find that the probability of detecting differences between two distributions is lower for accuracy-based statistics. We examine several candidate causes for the low power of accuracy tests. These causes…
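
The comparison described in the abstract can be sketched as a permutation test run twice on the same data, once with a cross-validated accuracy statistic and once with a multivariate statistic. The code below is a rough illustration under assumptions, not the authors' simulation setup: it uses Hotelling's two-sample T² with a pooled covariance (which needs more observations than features and at least two features), a nearest-centroid classifier with fixed folds, and the hypothetical function names `hotelling_t2`, `cv_accuracy`, and `permutation_pvalue`.

```python
import numpy as np

def hotelling_t2(X, y):
    """Two-sample Hotelling T^2 with pooled covariance (labels 0/1, n > p, p >= 2)."""
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    S = ((n0 - 1) * np.cov(X0, rowvar=False)
         + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    return float(n0 * n1 / (n0 + n1) * diff @ np.linalg.solve(S, diff))

def cv_accuracy(X, y, n_folds=5):
    """Cross-validated accuracy of a nearest-centroid classifier.

    Folds are fixed (index modulo n_folds). Assumes every training
    fold contains both classes.
    """
    n = len(y)
    fold = np.arange(n) % n_folds
    hits = 0
    for k in range(n_folds):
        tr, te = fold != k, fold == k
        mu0 = X[tr & (y == 0)].mean(axis=0)
        mu1 = X[tr & (y == 1)].mean(axis=0)
        d0 = ((X[te] - mu0) ** 2).sum(axis=1)
        d1 = ((X[te] - mu1) ** 2).sum(axis=1)
        hits += np.sum((d1 < d0).astype(int) == y[te])
    return hits / n

def permutation_pvalue(stat, X, y, n_perm=200, seed=0):
    """One-sided permutation p-value: large values of stat are evidence of signal."""
    rng = np.random.default_rng(seed)
    observed = stat(X, y)
    null = np.array([stat(X, rng.permutation(y)) for _ in range(n_perm)])
    return float((1 + np.sum(null >= observed)) / (1 + n_perm))

# Illustrative data: a weak mean shift between two 3-dimensional Gaussian samples
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 3)), rng.normal(0.4, 1.0, (20, 3))])
y = np.repeat([0, 1], 20)
print("accuracy test p-value:", permutation_pvalue(cv_accuracy, X, y))
print("Hotelling test p-value:", permutation_pvalue(hotelling_t2, X, y))
```

A single run like this says little about power. Repeating it over many simulated datasets and counting how often each p-value falls below the significance level gives the kind of detection-probability comparison the abstract refers to. Note that the accuracy statistic can take only a small number of distinct values (multiples of 1/n), so its permutation null distribution has many ties.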
