The Heterogeneous Ensembles of Standard Classification Algorithms (HESCA): the Whole is Greater than the Sum of its Parts
James Large, Jason Lines, Anthony Bagnall

TL;DR
This paper introduces HESCA and HESCA+, heterogeneous ensembles that combine classifiers from different families. In experiments on over 200 data sets, they outperform their individual component models and state-of-the-art classifiers, especially with small training sets and multiple classes.
Contribution
The paper presents HESCA and HESCA+, novel heterogeneous ensemble methods that leverage multiple classifier families to improve predictive performance.
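To make the idea of a heterogeneous ensemble concrete, the following is a minimal sketch of one common way to combine members from different classifier families: a weighted average of their class-probability estimates. The weighting scheme, the function names (`combine_weighted`, `predict`), and the example numbers are illustrative assumptions; this summary does not state how HESCA actually weights its components.

```python
from typing import List, Sequence


def combine_weighted(
    member_probs: Sequence[Sequence[float]],
    member_weights: Sequence[float],
) -> List[float]:
    """Return the weighted average of per-member class-probability vectors.

    member_probs[i][k] is member i's estimated probability of class k.
    member_weights[i] is a non-negative weight for member i, for example
    an accuracy estimate from cross-validation on the training data
    (an assumption made for this sketch).
    """
    if not member_probs or len(member_probs) != len(member_weights):
        raise ValueError("need at least one member and one weight per member")
    total = sum(member_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(member_probs[0])
    combined = [0.0] * n_classes
    for probs, weight in zip(member_probs, member_weights):
        if len(probs) != n_classes:
            raise ValueError("all members must score the same set of classes")
        for k, p in enumerate(probs):
            combined[k] += weight * p
    # Normalise by the total weight so the result is again a distribution.
    return [c / total for c in combined]


def predict(
    member_probs: Sequence[Sequence[float]],
    member_weights: Sequence[float],
) -> int:
    """Return the index of the class with the highest combined probability."""
    combined = combine_weighted(member_probs, member_weights)
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical outputs for one test case from three members drawn from
# different families (say a tree ensemble, an SVM, and a neural network).
example_probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
example_weights = [0.9, 0.6, 0.5]  # made-up accuracy estimates
example_label = predict(example_probs, example_weights)
```

Because each member only needs to supply a probability vector, members from any classifier family can be mixed freely, which is the property that distinguishes a heterogeneous ensemble from a homogeneous one such as a forest of trees.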
Findings
HESCA outperforms individual classifiers and tuned SVMs on various metrics.
HESCA+ further improves accuracy by including deep neural networks and decision forests.
Ensembles are especially effective with small training data and multi-class problems.
Abstract
Building classification models is an intrinsically practical exercise that requires many design decisions prior to deployment. We aim to provide some guidance in this decision-making process. Specifically, given a classification problem with real-valued attributes, we consider which classifier or family of classifiers one should use. Strong contenders are tree-based homogeneous ensembles, support vector machines or deep neural networks. All three families of model could claim to be state-of-the-art, and yet it is not clear when one is preferable to the others. Our extensive experiments with over 200 data sets from two distinct archives demonstrate that, rather than choose a single family and expend computing resources on optimising that model, it is significantly better to build simpler versions of classifiers from each family and ensemble. We show that the Heterogeneous Ensembles of…
Taxonomy
Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
