# Andro-Simnet: Android Malware Family Classification Using Social Network Analysis

**Authors:** Hye Min Kim, Hyun Min Song, Jae Woo Seo, Huy Kang Kim

arXiv: 1906.09456 · 2019-06-25

## TL;DR

This paper introduces Andro-Simnet, a behavior-based Android malware classification system built on social network analysis. It measures similarity between malware behavior patterns and applies community detection, reporting 97% classification accuracy and 95% prediction accuracy under K-fold cross-validation.

## Contribution

It presents a novel similarity measure and feature weighting method for malware classification, incorporating social network analysis and community detection to improve accuracy.
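A weighted similarity measure of the kind described above could be sketched as follows. This is a minimal illustration, not the paper's formula: the feature categories (`permissions`, `api_calls`), the use of the Jaccard index per category, and the weight values are all assumptions, since the summary does not state the actual features or the derived optimal weights.

```python
from typing import Dict, Set

# A sample is a mapping from feature category to the set of observed values.
# Category names here are hypothetical placeholders.
Sample = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def weighted_similarity(x: Sample, y: Sample, weights: Dict[str, float]) -> float:
    """Weighted average of per-category Jaccard similarities, in [0, 1]."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(
        w * jaccard(x.get(cat, set()), y.get(cat, set()))
        for cat, w in weights.items()
    )
    return score / total


# Illustrative weights and samples (assumed, not taken from the paper).
weights = {"permissions": 0.5, "api_calls": 0.5}
a = {"permissions": {"SEND_SMS", "INTERNET"}, "api_calls": {"sendTextMessage"}}
b = {"permissions": {"SEND_SMS"}, "api_calls": {"sendTextMessage"}}

print(weighted_similarity(a, b, weights))
```

Raising the weight of a category makes agreement in that category count for more, which is the lever a weight-tuning process would adjust.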

## Key findings

- 97% classification accuracy on a real malware dataset
- 95% prediction accuracy with K-fold cross-validation
- Graph visualization that shows the distribution and similarity of malware samples

## Abstract

While the rapid adoption of mobile devices has made daily life more convenient, the threat from malware has also increased. Much research addresses malware detection to protect mobile devices, but most of it adopts only signature-based detection, which polymorphic and metamorphic malware can easily bypass. To detect malware and its variants, it is essential to adopt behavior-based detection for efficient malware classification. This paper presents a system that classifies malware using the behavioral characteristics common to each malware family. We measure the similarity between malware families with carefully chosen features that commonly appear in the same family. With the proposed similarity measure, we can classify malware by its attack behavior pattern and tactical characteristics. We also apply a community detection algorithm to increase the modularity within each malware family network aggregation. To maintain high classification accuracy, we propose a process to derive the optimal weights of the selected features in the proposed similarity measure. During this process, we identify which features are significant for representing the similarity between malware samples. Finally, we provide an intuitive graph visualization of malware samples that helps in understanding the distribution and similarity of the malware networks. In the experiment, the proposed system achieved 97% accuracy for malware classification and 95% accuracy for prediction by K-fold cross-validation on a real malware dataset.
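The idea of turning pairwise similarities into a network and grouping samples can be sketched roughly as below. This is a simplified stand-in, not the paper's method: it links samples whose similarity reaches a threshold and then takes connected components, whereas the paper applies a community detection algorithm that this summary does not name. The function name `group_samples`, the threshold, and the scores are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def group_samples(
    similarity: Dict[Tuple[str, str], float], threshold: float
) -> List[List[str]]:
    """Link samples with similarity >= threshold, then return connected components.

    Connected components are a simple substitute for community detection.
    """
    graph: Dict[str, Set[str]] = {}
    for (u, v), score in similarity.items():
        # Register both samples so isolated ones still form their own group.
        graph.setdefault(u, set())
        graph.setdefault(v, set())
        if score >= threshold:
            graph[u].add(v)
            graph[v].add(u)

    seen: Set[str] = set()
    groups: List[List[str]] = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collects every sample reachable from start.
        stack = [start]
        seen.add(start)
        component: List[str] = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(component))
    return groups


# Made-up pairwise scores for four samples from two hypothetical families.
sims = {
    ("a1", "a2"): 0.9,
    ("a1", "b1"): 0.1,
    ("a2", "b1"): 0.2,
    ("b1", "b2"): 0.8,
    ("a1", "b2"): 0.1,
    ("a2", "b2"): 0.0,
}

print(group_samples(sims, threshold=0.5))
```

A modularity-based community detection method would replace the connected-components step; the thresholded graph construction would stay the same.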

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.09456/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1906.09456/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/1906.09456/full.md

---
Source: https://tomesphere.com/paper/1906.09456