Multi-label Classification for Android Malware Based on Active Learning

Qijing Qiao; Ruitao Feng; Sen Chen; Fei Zhang; Xiaohong Li

arXiv:2410.06444·cs.CR·October 10, 2024

TL;DR

This paper introduces MLCDroid, a multi-label classification system for Android malware that identifies multiple malicious behaviors, utilizing active learning and data augmentation to improve accuracy and provide detailed malware analysis.

Contribution

It presents the first multi-label Android malware classification approach that detects multiple malicious behaviors and employs active learning to enhance accuracy with limited labeled data.
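To make the combination of multi-label output and active learning concrete, here is a minimal sketch of one query round. The page does not state which query strategy MLCDroid uses, so uncertainty sampling is used here as a common stand-in. The label names, the 0.5 threshold, and all function names are hypothetical placeholders, not taken from the paper or its repository.

```python
from typing import Dict, List

# Hypothetical placeholder names: the paper defines six malicious
# behaviors, but this page does not list them.
LABELS = [f"behavior_{i}" for i in range(1, 7)]


def label_uncertainty(p: float) -> float:
    """Uncertainty of one binary label: 1.0 at p = 0.5, 0.0 at p = 0 or 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def sample_uncertainty(probs: List[float]) -> float:
    """Mean per-label uncertainty of one app's predicted probabilities."""
    return sum(label_uncertainty(p) for p in probs) / len(probs)


def select_queries(pool: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the ids of the `budget` most uncertain unlabeled samples.

    Assumption: uncertainty sampling stands in for whatever selection
    criterion the authors actually use.
    """
    ranked = sorted(pool, key=lambda k: sample_uncertainty(pool[k]), reverse=True)
    return ranked[:budget]


def to_label_set(probs: List[float], threshold: float = 0.5) -> List[str]:
    """Multi-label decision: every behavior at or above the threshold is reported."""
    return [name for name, p in zip(LABELS, probs) if p >= threshold]


if __name__ == "__main__":
    # Made-up per-label probabilities for three unlabeled apps.
    pool = {
        "app_a": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
        "app_b": [0.99, 0.99, 0.99, 0.99, 0.99, 0.99],
        "app_c": [0.9, 0.1, 0.6, 0.4, 0.95, 0.05],
    }
    print(select_queries(pool, budget=2))
    print(to_label_set(pool["app_c"]))
```

In a full loop, the selected samples would be annotated, added to the training set, and the classifier retrained before the next round. Unlike family classification, each sample can carry several behavior labels at once.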

Findings

01. In the algorithm comparison, the best result reached 73.3% effectiveness.

02. Active learning combined with data augmentation raised accuracy to 86.7%.

03. The authors constructed a labeled dataset covering six malicious behaviors drawn from real-world malware.

Abstract

Existing malware classification approaches (i.e., binary and family classification) produce outputs that barely benefit subsequent analysis. Family classification approaches additionally suffer from the lack of a formal naming standard and from incomplete definitions of malicious behaviors. More importantly, existing approaches cannot handle a single malware sample that exhibits multiple malicious behaviors, although this is very common for Android malware in the wild. As a result, neither kind of approach gives researchers a direct and sufficiently comprehensive understanding of malware. In this paper, we propose MLCDroid, an ML-based multi-label classification approach that directly indicates the presence of pre-defined malicious behaviors. Through an in-depth analysis of real-world malware accompanied by security reports, we summarize six basic malicious behaviors and construct a labeled dataset. We compare the…

Code & Models

Repositories

qqj1130247885/MLC-for-Android-Malware
Official
