Malware families discovery via Open-Set Recognition on Android manifest permissions
Filippo Leveni, Matteo Mistura, Francesco Iubatti, Carmine Giangregorio, Nicolò Pastore, Cesare Alippi, Giacomo Boracchi

TL;DR
This paper presents a practical malware family classification system for Android that combines open-set recognition with gradient boosting to detect both known and new malware families efficiently.
Contribution
It introduces a novel combination of MaxLogit open-set recognition with gradient boosting for malware classification, addressing the challenge of identifying new malware families.
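To make the MaxLogit idea concrete, here is a minimal sketch of the decision rule, not the authors' implementation. It assumes the classifier already yields one raw score (logit) per known family; the function names, the `keep_fraction` calibration rule, and the toy numbers are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def max_logit_predict(logits: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the top-scoring known family, or None for 'new family'.

    MaxLogit uses the largest raw score as the confidence signal:
    a low maximum suggests the sample matches no known family.
    """
    best = max(range(len(logits)), key=lambda k: logits[k])
    return best if logits[best] >= threshold else None


def calibrate_threshold(known_logits: Sequence[Sequence[float]],
                        keep_fraction: float = 0.95) -> float:
    """Pick a threshold that accepts roughly `keep_fraction` of known samples.

    Assumption: the threshold is set from held-out samples of known families
    only. The source does not say how the threshold is actually chosen.
    """
    scores = sorted(max(row) for row in known_logits)
    cut = int((1.0 - keep_fraction) * len(scores))
    return scores[min(cut, len(scores) - 1)]


# Toy validation logits for three known families (made-up numbers).
validation = [
    [2.0, 0.1, -1.0],
    [0.3, 3.1, 0.0],
    [-0.5, 0.2, 2.5],
    [1.8, 0.0, 0.4],
]
threshold = calibrate_threshold(validation, keep_fraction=0.75)

confident = max_logit_predict([0.3, 3.1, 0.0], threshold)   # strong score for family 1
uncertain = max_logit_predict([0.4, 0.2, 0.1], threshold)   # no family scores highly
```

With a gradient boosting model, the per-family logits would be the raw margins produced before any softmax normalization. Samples returned as `None` are the candidates for a previously unseen family.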
Findings
Effective detection of known malware families.
Successful identification of new malware families.
Low computational overhead in practical deployment.
Abstract
Malware are malicious programs that are grouped into families based on their penetration technique, source code, and other characteristics. Classifying malware programs into their respective families is essential for building effective defenses against cyber threats. Machine learning models have a huge potential in malware detection on mobile devices, as malware families can be recognized by classifying permission data extracted from Android manifest files. Still, the malware classification task is challenging due to the high-dimensional nature of permission data and the limited availability of training samples. In particular, the steady emergence of new malware families makes it impossible to acquire a comprehensive training set covering all the malware classes. In this work, we present a malware classification system that, on top of classifying known malware, detects new ones. In…
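As an illustration of what permission data from a manifest can look like, the sketch below turns the `uses-permission` entries of a manifest into a binary feature vector. It assumes the manifest has already been decoded to plain-text XML (inside an APK it is stored in a binary format) and that features are binary indicators over a fixed vocabulary; the function name, the vocabulary, and the sample manifest are hypothetical, and the paper's actual encoding may differ.

```python
import xml.etree.ElementTree as ET
from typing import List

# XML namespace used by the android: attribute prefix in manifest files.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def permission_vector(manifest_xml: str, vocabulary: List[str]) -> List[int]:
    """Encode requested permissions as 0/1 indicators over `vocabulary`."""
    root = ET.fromstring(manifest_xml)
    requested = {
        elem.get(f"{{{ANDROID_NS}}}name")
        for elem in root.iter("uses-permission")
    }
    return [1 if perm in requested else 0 for perm in vocabulary]


# Hypothetical decoded manifest and a tiny permission vocabulary.
sample_manifest = (
    '<manifest xmlns:android="http://schemas.android.com/apk/res/android" '
    'package="com.example.app">'
    '<uses-permission android:name="android.permission.INTERNET"/>'
    '<uses-permission android:name="android.permission.SEND_SMS"/>'
    '</manifest>'
)
vocabulary = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
]
features = permission_vector(sample_manifest, vocabulary)
```

A realistic vocabulary contains hundreds of permissions, which is where the high dimensionality mentioned in the abstract comes from.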
Taxonomy
Topics: Advanced Malware Detection Techniques · Digital and Cyber Forensics · Software Engineering Research
Methods: Sparse Evolutionary Training
