# Network-based Analysis and Classification of Malware using Behavioral Artifacts Ordering

**Authors:** Aziz Mohaisen, Omar Alrawi, Jeman Park, Joongheon Kim, DaeHun Nyang, Manar Mohaisen

arXiv: 1901.01185 · 2019-01-07

## TL;DR

This paper introduces Chatter, a malware classification system that considers only the order in which high-level system events occur, aiming for high accuracy at lower computational cost than methods built on fine-grained features.

## Contribution

The paper predicts malware family from event ordering alone, using n-gram document classification. This reduces reliance on fine-grained features, which are computationally expensive, and on signatures and heuristics, which malware authors often circumvent.

## Key findings

- Achieves 83%-94% accuracy in isolation, using network events alone across eleven malware families.
- Reaches an accuracy of up to 98.8% when integrated with a baseline classifier.
- Demonstrates effectiveness of sequence-based features in malware classification.

## Abstract

Using runtime execution artifacts to identify malware and its associated family is an established technique in the security domain. Many papers in the literature rely on explicit features derived from network, file system, or registry interaction. While effective, the use of these fine-granularity data points makes these techniques computationally expensive. Moreover, the signatures and heuristics are often circumvented by subsequent malware authors. In this work, we propose Chatter, a system that is concerned only with the order in which high-level system events take place. Individual events are mapped onto an alphabet and execution traces are captured via terse concatenations of those letters. Then, leveraging an analyst-labeled corpus of malware, n-gram document classification techniques are applied to produce a classifier predicting malware family. This paper describes that technique and its proof-of-concept evaluation. In its prototype form, only network events are considered and eleven malware families are used. We show the technique achieves 83%-94% accuracy in isolation and makes non-trivial performance improvements when integrated with a baseline classifier of combined order features to reach an accuracy of up to 98.8%.
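The pipeline described in the abstract (events mapped to letters, traces concatenated into strings, n-grams fed to a classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the event names, the letters in `EVENT_ALPHABET`, the family labels, the choice of bigrams, and the cosine-similarity nearest-profile classifier are hypothetical stand-ins, not the alphabet or classifier used in the paper.

```python
from collections import Counter
import math

# Hypothetical alphabet: event names and letters are illustrative only,
# not the mapping used by Chatter.
EVENT_ALPHABET = {
    "dns_query": "D",
    "tcp_connect": "T",
    "http_request": "H",
    "udp_send": "U",
}


def encode_trace(events):
    """Concatenate one letter per event, preserving execution order."""
    return "".join(EVENT_ALPHABET[e] for e in events)


def ngram_counts(trace, n=2):
    """Count every contiguous n-gram of length n in an encoded trace."""
    return Counter(trace[i:i + n] for i in range(len(trace) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labeled_traces, n=2):
    """Sum n-gram counts per family to build one profile per label."""
    profiles = {}
    for events, family in labeled_traces:
        counts = ngram_counts(encode_trace(events), n)
        profiles.setdefault(family, Counter()).update(counts)
    return profiles


def predict(profiles, events, n=2):
    """Return the family whose profile is most similar to the trace."""
    query = ngram_counts(encode_trace(events), n)
    return max(profiles, key=lambda fam: cosine(query, profiles[fam]))


# Toy analyst-labeled corpus (fabricated for illustration, not real data).
corpus = [
    (["dns_query", "tcp_connect", "http_request", "http_request"], "family_a"),
    (["dns_query", "tcp_connect", "http_request"], "family_a"),
    (["udp_send", "udp_send", "dns_query", "udp_send"], "family_b"),
]
profiles = train(corpus)
print(predict(profiles, ["dns_query", "tcp_connect", "http_request"]))
```

Because only the order of coarse events is encoded, the feature space stays small: with an alphabet of k letters there are at most k^n distinct n-grams, regardless of the payloads, hosts, or paths involved in each event.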

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.01185/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1901.01185/full.md

## References

84 references — full list in the complete paper: https://tomesphere.com/paper/1901.01185/full.md

---
Source: https://tomesphere.com/paper/1901.01185