Comparison of 14 different families of classification algorithms on 115 binary datasets
Jacques Wainer

TL;DR
This study systematically compares 14 classification algorithms across 115 binary datasets, finding that random forest, gradient boosting, and RBF SVM perform similarly with negligible practical differences, and RBF SVM is the fastest.
Contribution
It provides a comprehensive empirical comparison of diverse classifiers on real datasets, highlighting practical equivalences and efficiency considerations.
Findings
Random forest, gradient boosting (gbm), and RBF SVM are not significantly different from each other in performance.
A change of less than 0.0112 in error rate is practically irrelevant.
RBF SVM is the fastest classifier in training and testing.
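To make the 0.0112 threshold concrete, the following minimal sketch counts how often two classifiers differ by less than that amount across datasets. It is only an illustration: the paper reaches its conclusion through a Bayesian ANOVA, which this does not reproduce, and the function names and error rates below are hypothetical.

```python
# Threshold of practical irrelevance quoted in the findings above.
IRRELEVANCE_THRESHOLD = 0.0112


def practically_equivalent(err_a, err_b, threshold=IRRELEVANCE_THRESHOLD):
    """Return True when two error rates differ by less than the threshold."""
    return abs(err_a - err_b) < threshold


def fraction_equivalent(errors_a, errors_b, threshold=IRRELEVANCE_THRESHOLD):
    """Fraction of datasets on which two classifiers are practically equivalent.

    errors_a[i] and errors_b[i] are the error rates of the two classifiers
    on dataset i.
    """
    if len(errors_a) != len(errors_b) or not errors_a:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        practically_equivalent(a, b, threshold)
        for a, b in zip(errors_a, errors_b)
    )
    return hits / len(errors_a)


# Made-up error rates for illustration only (not results from the paper).
errors_rf = [0.10, 0.25, 0.08, 0.31]
errors_svm = [0.105, 0.22, 0.09, 0.312]
share = fraction_equivalent(errors_rf, errors_svm)
print(f"practically equivalent on {share:.0%} of datasets")
```

A per-dataset count like this ignores uncertainty in each error estimate, which is one reason a model-based analysis is preferable for the actual claim.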
Abstract
We tested 14 very different classification algorithms (random forest, gradient boosting machines, SVM - linear, polynomial, and RBF - 1-hidden-layer neural nets, extreme learning machines, k-nearest neighbors and a bagging of knn, naive Bayes, learning vector quantization, elastic net logistic regression, sparse linear discriminant analysis, and a boosting of linear classifiers) on 115 real life binary datasets. We followed the Demsar analysis and found that the three best classifiers (random forest, gbm and RBF SVM) are not significantly different from each other. We also discuss that a change of less than 0.0112 in the error rate should be considered as an irrelevant change, and used a Bayesian ANOVA analysis to conclude that with high probability the differences between these three classifiers are not of practical consequence. We also verified the execution time of "standard…
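The Demsar-style comparison mentioned in the abstract is commonly described as ranking the classifiers on each dataset, applying a Friedman test to the average ranks, and then using a Nemenyi post-hoc critical difference. The sketch below shows those three computations under that assumption; the abstract does not spell out the exact steps used. The error matrix is made up, the function names are hypothetical, the Friedman statistic omits the tie correction, and the value 2.343 is the tabulated Nemenyi critical value for three classifiers at the 0.05 level.

```python
import math


def average_ranks(error_matrix):
    """Average rank of each classifier; error_matrix[i][j] is the error of
    classifier j on dataset i. The lowest error gets rank 1, ties share
    the mean of the ranks they span."""
    n_datasets = len(error_matrix)
    k = len(error_matrix[0])
    rank_sums = [0.0] * k
    for row in error_matrix:
        order = sorted(range(k), key=lambda j: row[j])
        pos = 0
        while pos < k:
            end = pos
            # Extend the group while the next classifier has the same error.
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared_rank = (pos + end) / 2 + 1  # mean of 1-based positions
            for idx in order[pos:end + 1]:
                rank_sums[idx] += shared_rank
            pos = end + 1
    return [s / n_datasets for s in rank_sums]


def friedman_statistic(avg_ranks, n_datasets):
    """Friedman chi-square statistic computed from average ranks
    (no tie correction)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(k, n_datasets, q_alpha):
    """Two classifiers differ significantly when their average ranks
    differ by at least this amount."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n_datasets))


# Made-up error rates: 4 datasets (rows) x 3 classifiers (columns).
toy_errors = [
    [0.10, 0.12, 0.20],
    [0.30, 0.28, 0.35],
    [0.05, 0.05, 0.09],
    [0.22, 0.25, 0.40],
]
ranks = average_ranks(toy_errors)
chi2 = friedman_statistic(ranks, len(toy_errors))
cd = nemenyi_critical_difference(k=3, n_datasets=len(toy_errors), q_alpha=2.343)
print(ranks, chi2, cd)
```

With only four datasets the critical difference is large, so even the worst-ranked toy classifier is not separated from the best; the same computation on many datasets yields a much tighter threshold.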
Taxonomy
Topics: Machine Learning and ELM · Neural Networks and Applications · Face and Expression Recognition
Methods: Support Vector Machine
