On Ensemble Learning
Mark Stamp, Aniket Chandak, Gavin Wong, Allen Ye

TL;DR
This paper introduces a unified framework for categorizing ensemble classifiers, surveys their use in malware analysis, and compares ensemble techniques empirically on a single large and challenging malware dataset.
Contribution
It offers a comprehensive categorization framework for ensemble classifiers and presents the first direct comparison of their performance in malware detection.
Findings
Ensemble techniques improve malware detection accuracy.
Performance varies across ensemble methods when they are compared on a common dataset.
A unified framework aids in understanding and comparing ensemble methods.
Abstract
In this paper, we consider ensemble classifiers, that is, machine learning based classifiers that utilize a combination of scoring functions. We provide a framework for categorizing such classifiers, and we outline several ensemble techniques, discussing how each fits into our framework. From this general introduction, we then pivot to the topic of ensemble learning within the context of malware analysis. We present a brief survey of some of the ensemble techniques that have been used in malware (and related) research. We conclude with an extensive set of experiments, where we apply ensemble techniques to a large and challenging malware dataset. While many of these ensemble techniques have appeared in the malware literature, previously there has been no way to directly compare results such as these, as different datasets and different measures of success are typically used. Our common…
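The abstract's definition of an ensemble classifier, a classifier that combines several scoring functions, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the toy scorers, the `make_ensemble` helper, and the three combiners (average, maximum, majority vote) are common choices and are not taken from the paper's framework or experiments.

```python
from statistics import mean
from typing import Callable, Sequence

# A scoring function maps a feature vector to a real-valued score.
# In this illustration, a higher score means "more likely malware".
Score = Callable[[Sequence[float]], float]
Combiner = Callable[[Sequence[float]], float]


def make_ensemble(scorers: Sequence[Score], combine: Combiner) -> Score:
    """Build one scoring function by combining the outputs of several."""
    def ensemble(x: Sequence[float]) -> float:
        return combine([s(x) for s in scorers])
    return ensemble


def majority_vote(threshold: float = 0.5) -> Combiner:
    """Combiner returning the fraction of scorers at or above the threshold."""
    def combine(scores: Sequence[float]) -> float:
        return mean(1.0 if s >= threshold else 0.0 for s in scores)
    return combine


def clip(v: float) -> float:
    """Clamp a value into the range [0, 1]."""
    return min(1.0, max(0.0, v))


# Hypothetical toy scorers over a two-feature sample (not from the paper).
def score_a(x: Sequence[float]) -> float:
    return clip(x[0])


def score_b(x: Sequence[float]) -> float:
    return clip(x[1])


def score_c(x: Sequence[float]) -> float:
    return clip((x[0] + x[1]) / 2)


scorers = [score_a, score_b, score_c]
avg_ensemble = make_ensemble(scorers, mean)
max_ensemble = make_ensemble(scorers, max)
vote_ensemble = make_ensemble(scorers, majority_vote(0.5))

if __name__ == "__main__":
    sample = [0.9, 0.2]
    print(avg_ensemble(sample), max_ensemble(sample), vote_ensemble(sample))
```

The same base scorers produce different ensemble scores depending on the combiner, which is why a shared dataset and a shared measure of success are needed before results from different ensemble techniques can be compared.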
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications
