Learning the PE Header, Malware Detection with Minimal Domain Knowledge

Edward Raff; Jared Sylvester; Charles Nicholas

arXiv:1709.01471·stat.ML·November 15, 2017

TL;DR

This paper shows that neural networks can detect malware with minimal domain knowledge: trained on the raw bytes of a portion of the PE header, they perform better than an approach that parses the header into explicit features.

Contribution

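The paper applies neural networks to malware detection using only enough domain knowledge to extract a portion of the Portable Executable (PE) header. The networks learn features directly from those raw bytes and perform better than a domain knowledge approach built on explicitly parsed header features.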

Findings

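01

Neural networks trained on raw PE header bytes perform better than a domain knowledge approach that parses the header into explicit features.

02

Neural networks can learn from raw bytes without explicit feature construction.

03

The only domain knowledge required is enough to locate and extract a portion of the PE header.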

Abstract

Many efforts have been made to use various forms of domain knowledge in malware detection. Currently there exist two common approaches to malware detection without domain knowledge, namely byte n-grams and strings. In this work we explore the feasibility of applying neural networks to malware detection and feature learning. We do this by restricting ourselves to a minimal amount of domain knowledge in order to extract a portion of the Portable Executable (PE) header. By doing this we show that neural networks can learn from raw bytes without explicit feature construction, and perform even better than a domain knowledge approach that parses the PE header into explicit features.
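
As a rough illustration of the two ideas in the abstract, the sketch below extracts a fixed-length window of raw bytes from the PE header and counts byte n-grams. It is not the authors' code. It relies only on the documented PE layout (the `MZ` magic, the 4-byte `e_lfanew` pointer at offset 0x3C, and the `PE\0\0` signature). The function names, the window length of 256 bytes, and the choice to start the window at the PE signature are assumptions made for this example and are not taken from the paper.

```python
import struct
from collections import Counter
from typing import List

E_LFANEW_OFFSET = 0x3C        # DOS header field holding the offset of the PE signature
PE_SIGNATURE = b"PE\x00\x00"
HEADER_LEN = 256              # hypothetical window size, not taken from the paper


def extract_header_bytes(data: bytes, length: int = HEADER_LEN) -> List[int]:
    """Return `length` byte values starting at the PE signature, zero-padded.

    No header fields are parsed beyond locating the signature; the output is
    a plain sequence of integers in [0, 255] that a model could consume.
    """
    if len(data) < E_LFANEW_OFFSET + 4 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    if data[pe_offset:pe_offset + 4] != PE_SIGNATURE:
        raise ValueError("PE signature not found")
    window = data[pe_offset:pe_offset + length]
    # Pad short files so every sample has the same length.
    return list(window) + [0] * (length - len(window))


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams, the domain-knowledge-free baseline."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))
```

Calling `extract_header_bytes(open(path, "rb").read())` on a PE file would yield a fixed-length integer sequence. How such a sequence is fed to a network (for example, through an embedding layer) is left open here because the abstract does not specify it.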
