Capturing the symptoms of malicious code in electronic documents by file's entropy signal combined with Machine learning

Luping Liu; Xiaohai He; Liang Liu; Lingbo Qing; Yong Fang; Jiayong Liu

arXiv:1903.10208 · cs.CY · March 26, 2019

TL;DR

This paper introduces ESRMD, a machine learning framework that detects malicious documents by analyzing entropy signals, effectively handling various formats and obfuscation techniques with high accuracy.

Contribution

The study proposes a novel entropy-based feature extraction method for malware detection in documents, improving robustness against unknown attacks and format diversity.

Findings

01 Achieved 96% true positive rate in detection

02 Outperformed existing antivirus engines and tools

03 Demonstrated robustness against obfuscation techniques

Abstract

Email cyber-attacks based on malicious documents have become a popular technique in today's sophisticated attacks. Persistent efforts have been made to detect such attacks, but existing methods still share some common defects: they cannot capture unknown attacks, they incur high resource and time overhead, and they can only be used to detect specific document formats. In this study, a new framework named ESRMD (Entropy signal Reflects the Malicious document) is proposed, which detects malicious documents based on the entropy distribution of the file. In essence, ESRMD is a machine learning classifier. What makes it distinctive is that it extracts global and structural entropy features from the entropy of the malicious documents rather than from the structural data or metadata of the file, endowing it with the ability to deal with various document formats and…
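
To make the idea of an entropy signal concrete, the following is a minimal sketch and not the authors' implementation. It assumes fixed, non-overlapping windows of 256 bytes and a few summary statistics standing in for "global" features. The window size, the function names (`shannon_entropy`, `entropy_signal`, `global_features`), and the choice of statistics are all illustrative assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter
from statistics import mean, pstdev
from typing import Dict, List


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte (range 0 to 8)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_signal(data: bytes, window: int = 256) -> List[float]:
    """Entropy of each consecutive, non-overlapping window of the file.

    The window size of 256 bytes is an assumption made for illustration.
    """
    return [
        shannon_entropy(data[i:i + window])
        for i in range(0, len(data), window)
    ]


def global_features(signal: List[float]) -> Dict[str, float]:
    """Hypothetical 'global' features: simple statistics over the signal."""
    if not signal:
        return {"mean": 0.0, "std": 0.0, "min": 0.0, "max": 0.0}
    return {
        "mean": mean(signal),
        "std": pstdev(signal),
        "min": min(signal),
        "max": max(signal),
    }


if __name__ == "__main__":
    # Synthetic input: one constant window followed by one window that
    # contains every byte value exactly once.
    sample = b"\x00" * 256 + bytes(range(256))
    signal = entropy_signal(sample)
    print(signal)
    print(global_features(signal))
```

Because the signal is computed from raw bytes only, the same code applies to any file regardless of its document format, which is the property the abstract emphasizes. A feature vector like the one returned by `global_features` could then be passed to any standard classifier; which classifier and which structural features ESRMD actually uses is not stated in the text above.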

Taxonomy

Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Spam and Phishing Detection