# Can Machine Learning Model with Static Features be Fooled: an Adversarial Machine Learning Approach

**Authors:** Rahim Taheri, Reza Javidan, Mohammad Shojafar, Vinod P, Mauro Conti

arXiv: 1904.09433 · 2020-03-03

## TL;DR

This paper demonstrates that machine learning models for Android malware detection can be fooled by adversarial examples, and proposes defense mechanisms against them. Hardening the detector with GAN-generated samples yields reported detection-rate improvements of up to 50%.

## Contribution

The paper introduces five attack scenarios that perturb malicious apps to generate adversarial malware examples, and proposes two defense mechanisms, including GAN-based hardening, to improve detection robustness.

## Key findings

- Adversarial samples can evade detection with high probability.
- Hardening the detector with the generated evasive variants yields reported detection-rate improvements of up to 50% with the GAN method.
- GAN-based methods effectively harden malware detection systems.
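
The evade-then-harden pattern behind these findings can be illustrated with a toy sketch. It is not the paper's method: the 20 binary features, the logistic-regression detector, the feature-addition attack, and the budget of 8 are all assumptions made for illustration, and plain adversarial retraining stands in for the GAN-based defense, which this summary does not describe in detail.

```python
import math
import random

N_FEATURES = 20  # hypothetical binary static features (e.g. "permission requested")
BUDGET = 8       # hypothetical cap on how many features the attacker may add

def make_samples(rng, n, malicious):
    """Toy data: malware tends to set features 0-9, benign apps features 10-19."""
    samples = []
    for _ in range(n):
        x = []
        for i in range(N_FEATURES):
            typical = (i < 10) == malicious
            x.append(1 if rng.random() < (0.8 if typical else 0.2) else 0)
        samples.append(x)
    return samples

def sigmoid(z):
    # Numerically stable for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, b, x):
    return b + sum(wi * xi for wi, xi in zip(w, x))

def train(xs, ys, epochs=200, lr=0.3):
    """Logistic regression fitted by batch gradient descent (label 1 = malware)."""
    w, b = [0.0] * N_FEATURES, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * N_FEATURES, 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(score(w, b, x)) - y
            gb += err
            for i, xi in enumerate(x):
                if xi:
                    gw[i] += err
        w = [wi - lr * g / len(xs) for wi, g in zip(w, gw)]
        b -= lr * gb / len(xs)
    return w, b

def evade(w, x, budget=BUDGET):
    """Feature-addition attack: switch on up to `budget` absent features whose
    weights pull the score toward 'benign'. Nothing is removed, a common
    assumption for keeping the perturbed app functional."""
    candidates = sorted(
        (i for i in range(N_FEATURES) if x[i] == 0 and w[i] < 0),
        key=lambda i: w[i],
    )
    x_adv = list(x)
    for i in candidates[:budget]:
        x_adv[i] = 1
    return x_adv

def detection_rate(w, b, malware):
    """Fraction of malware samples with a positive (malicious) score."""
    return sum(score(w, b, x) > 0 for x in malware) / len(malware)

rng = random.Random(0)
train_mal = make_samples(rng, 200, True)
train_ben = make_samples(rng, 200, False)
test_mal = make_samples(rng, 200, True)

# 1. Baseline detector and its detection rate on unmodified malware.
w, b = train(train_mal + train_ben, [1] * 200 + [0] * 200)
clean_rate = detection_rate(w, b, test_mal)

# 2. Evasion: perturb held-out malware against the baseline detector.
test_adv = [evade(w, x) for x in test_mal]
attacked_rate = detection_rate(w, b, test_adv)

# 3. Hardening: retrain with adversarial variants of the training malware,
#    labelled as malware (adversarial retraining, not a GAN).
train_adv = [evade(w, x) for x in train_mal]
w_h, b_h = train(train_mal + train_adv + train_ben, [1] * 400 + [0] * 200)
hardened_rate = detection_rate(w_h, b_h, test_adv)

print(f"clean={clean_rate:.2f} attacked={attacked_rate:.2f} hardened={hardened_rate:.2f}")
```

The three printed rates correspond to detection on unmodified malware, detection on perturbed malware, and detection on the same perturbed malware after retraining. The absolute numbers come from synthetic data and say nothing about the paper's results; only the qualitative pattern (a drop under attack, a recovery after hardening) is the point.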

## Abstract

The widespread adoption of smartphones dramatically increases the risk of attacks and the spread of mobile malware, especially on the Android platform. Machine learning-based solutions have already been used as a tool to supersede signature-based anti-malware systems. However, malware authors leverage features from malicious and legitimate samples to estimate the statistical difference in order to create adversarial examples. Hence, to evaluate the vulnerability of machine learning algorithms in malware detection, we propose five different attack scenarios to perturb malicious applications (apps). By doing this, the classification algorithm inappropriately fits the discriminant function on the set of data points, eventually yielding a higher misclassification rate. Further, to distinguish the adversarial examples from benign samples, we propose two defense mechanisms to counter attacks. To validate our attacks and solutions, we test our model on three different benchmark datasets. We also test our methods using various classifier algorithms and compare them with the state-of-the-art data poisoning method using the Jacobian matrix. Promising results show that generated adversarial samples can evade detection with a very high probability. Additionally, evasive variants generated by our attack models, when used to harden the developed anti-malware system, improve the detection rate up to 50% when using the Generative Adversarial Network (GAN) method.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.09433/full.md

## Figures

37 figures with captions in the complete paper: https://tomesphere.com/paper/1904.09433/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1904.09433/full.md

---
Source: https://tomesphere.com/paper/1904.09433