Statistical comparison of classifiers through Bayesian hierarchical modelling

Giorgio Corani; Alessio Benavoli; Janez Demšar; Francesca Mangili; Marco Zaffalon

arXiv:1609.08905·cs.LG·November 23, 2016·1 cites

TL;DR

This paper introduces a Bayesian hierarchical model for comparing two classifiers across multiple data sets. It returns the posterior probability that their accuracies are practically equivalent or significantly different, which null hypothesis significance tests cannot provide.

Contribution

It presents a novel Bayesian hierarchical approach that jointly analyzes cross-validation results, reducing estimation error and overcoming limitations of NHST in classifier comparison.

Findings

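01 Reduces estimation error compared to averaging the cross-validation results obtained on each data set separately

02 Provides posterior probabilities that two classifiers are practically equivalent or significantly different

03 Analyzes the cross-validation results from all data sets jointly rather than one data set at a time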

Abstract

Usually one compares the accuracy of two competing classifiers via null hypothesis significance tests (NHST). Yet NHST suffers from important shortcomings, which can be overcome by switching to Bayesian hypothesis testing. We propose a Bayesian hierarchical model which jointly analyzes the cross-validation results obtained by two classifiers on multiple data sets. It returns the posterior probability of the accuracies of the two classifiers being practically equivalent or significantly different. A further strength of the hierarchical model is that, by jointly analyzing the results obtained on all data sets, it reduces the estimation error compared to the usual approach of averaging the cross-validation results obtained on a given data set.
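To make the idea of "practically equivalent or significantly different" concrete, the following is a minimal sketch of how posterior draws of an accuracy difference could be summarised against a region of practical equivalence (rope). It is not the authors' implementation and does not fit the hierarchical model itself: the function name `rope_probabilities`, the rope half-width of 0.01, and the example draws are all illustrative assumptions, and the draws would in practice come from sampling the model's posterior.

```python
import numpy as np


def rope_probabilities(delta_samples, rope=0.01):
    """Summarise posterior draws of an accuracy difference against a rope.

    delta_samples: 1-D array of posterior draws of (accuracy A - accuracy B).
    rope: half-width of the region of practical equivalence (assumed value).
    Returns (p_b_better, p_equivalent, p_a_better) as fractions of the draws.
    """
    delta = np.asarray(delta_samples, dtype=float)
    p_b_better = float(np.mean(delta < -rope))           # B better by more than the rope
    p_equivalent = float(np.mean(np.abs(delta) <= rope))  # difference inside the rope
    p_a_better = float(np.mean(delta > rope))            # A better by more than the rope
    return p_b_better, p_equivalent, p_a_better


# Placeholder draws for illustration only, NOT results from the paper.
example_draws = np.array([-0.05, -0.005, 0.0, 0.005, 0.02, 0.03])
p_b, p_eq, p_a = rope_probabilities(example_draws)
print(p_b, p_eq, p_a)
```

The three fractions always sum to one, so they can be read directly as the posterior probabilities of the three mutually exclusive outcomes, under the assumption that the draws are a faithful sample from the posterior.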

Code & Models

Repositories

BayesianTestsML/tutorial (Official)

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Neural Networks and Applications · Advanced Statistical Methods and Models