Transfer Learning for Image-Based Malware Classification

Niket Bhodia; Pratikkumar Prajapati; Fabio Di Troia; Mark Stamp

arXiv:1903.11551 · cs.LG · March 28, 2019 · 19 cites

TL;DR

This paper explores using transfer learning with deep learning models on image representations of executables for malware classification, comparing it to simple k-NN methods, and analyzing their generalization capabilities.

Contribution

It introduces a transfer learning approach for image-based malware classification and compares its effectiveness to traditional feature-based k-NN methods.

Findings

01

Deep learning models perform well but are outperformed by k-NN in standard tests.

02

DL models generalize better: they outperform k-NN in simulated zero-day experiments.

03

Simple k-NN with features from executables can be surprisingly effective.
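
To make the k-NN baseline concrete, here is a minimal sketch of k-nearest-neighbors classification by majority vote. It is not the authors' implementation: the function name `knn_predict`, the Euclidean distance, the toy two-dimensional feature vectors, and the family labels are all assumptions made for illustration. The summary does not say which features the paper extracts from executables.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of (feature_vector, label) pairs; vectors are equal-length
           sequences of numbers (hypothetical features, not the paper's).
    query: feature vector of the sample to classify.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training samples")
    # Sort training samples by Euclidean distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of those k neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: two made-up families with two-dimensional features.
    train = [
        ((0.0, 0.0), "family_a"),
        ((0.1, 0.2), "family_a"),
        ((1.0, 1.0), "family_b"),
        ((0.9, 1.1), "family_b"),
    ]
    print(knn_predict(train, (0.2, 0.1), k=3))
```

Because k-NN has no training phase beyond storing labeled samples, its accuracy depends on how well the chosen features and distance separate the families.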

Abstract

In this paper, we consider the problem of malware detection and classification based on image analysis. We convert executable files to images and apply image recognition using deep learning (DL) models. To train these models, we employ transfer learning based on existing DL models that have been pre-trained on massive image datasets. We carry out various experiments with this technique and compare its performance to that of an extremely simple machine learning technique, namely, k-nearest neighbors (k-NN). For our k-NN experiments, we use features extracted directly from executables, rather than image analysis. While our image-based DL technique performs well in the experiments, surprisingly, it is outperformed by k-NN. We show that DL models are better able to generalize the data, in the sense that they outperform k-NN in simulated zero-day experiments.
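
One common way to turn an executable into an image is to treat each byte as a grayscale pixel value (0 to 255) and wrap the byte stream into fixed-width rows. The sketch below illustrates that idea only. The function name `bytes_to_image`, the default width of 256, and the zero-padding of the last row are assumptions; the abstract does not state the conversion the authors use.

```python
import math


def bytes_to_image(data: bytes, width: int = 256) -> list[list[int]]:
    """Reshape raw bytes into rows of `width` grayscale pixels (values 0-255).

    The final row is padded with zeros so that every row has the same length.
    Both the width and the padding scheme are illustrative assumptions.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Append zero bytes so the data fills a complete height x width grid.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


if __name__ == "__main__":
    # Ten sample bytes wrapped into rows of four pixels.
    for row in bytes_to_image(bytes(range(10)), width=4):
        print(row)
```

The resulting grid can then be resized to whatever input shape a pre-trained image model expects before fine-tuning.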

Code & Models

Repositories

pratikpv/malware_classification (Official)

Taxonomy

Topics: Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning

Methods: k-Nearest Neighbors