Capturing the symptoms of malicious code in electronic documents by file's entropy signal combined with Machine learning
Luping Liu, Xiaohai He, Liang Liu, Lingbo Qing, Yong Fang, Jiayong Liu

TL;DR
This paper introduces ESRMD, a machine learning framework that detects malicious documents by analyzing entropy signals, effectively handling various formats and obfuscation techniques with high accuracy.
Contribution
The study proposes a novel entropy-based feature extraction method for malware detection in documents, improving robustness against unknown attacks and format diversity.
Findings
Achieved 96% true positive rate in detection
Outperformed existing antivirus engines and tools
Demonstrated robustness against obfuscation techniques
Abstract
Email cyber-attacks based on malicious documents have become a popular technique in today's sophisticated attacks. Persistent efforts have been made in the past to detect such attacks, but existing methods still share some common defects: they cannot capture unknown attacks, they incur high resource and time overhead, and they can only detect specific document formats. In this study, a new framework named ESRMD (Entropy signal Reflects the Malicious document) is proposed, which detects malicious documents based on the entropy distribution of the file. In essence, ESRMD is a machine learning classifier. What makes it distinctive is that it extracts global and structural entropy features from the entropy of the malicious documents rather than from the structural data or metadata of the file, endowing it with the ability to deal with various document formats and…
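To make the idea of a file's entropy signal concrete, the following is a minimal Python sketch. It is not the ESRMD implementation. It assumes the signal is the Shannon entropy of consecutive fixed-size byte windows, and it computes a few simple summary statistics as stand-ins for global features. The window size of 256 bytes, the function names, and the choice of statistics are illustrative assumptions; the paper's actual global and structural feature set is not described in this summary.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (range 0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    # Sum -p * log2(p) over every byte value that occurs in the chunk.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def entropy_signal(data: bytes, window: int = 256) -> list[float]:
    """Entropy of consecutive non-overlapping windows (window size is an assumption)."""
    return [shannon_entropy(data[i:i + window]) for i in range(0, len(data), window)]


def global_features(signal: list[float]) -> dict[str, float]:
    """Hypothetical summary statistics of the signal, not the paper's feature set."""
    if not signal:
        return {"mean": 0.0, "std": 0.0, "min": 0.0, "max": 0.0}
    mean = sum(signal) / len(signal)
    variance = sum((x - mean) ** 2 for x in signal) / len(signal)
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "min": min(signal),
        "max": max(signal),
    }


if __name__ == "__main__":
    # Synthetic input: a constant region followed by a region with all 256 byte values.
    sample = b"\x00" * 256 + bytes(range(256))
    signal = entropy_signal(sample)
    print(signal)
    print(global_features(signal))
```

In a pipeline along these lines, a feature vector derived from the signal would be passed to a classifier trained on labeled benign and malicious documents. Because the computation reads raw bytes only, it does not depend on parsing any particular document format, which matches the format independence the abstract describes.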
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Spam and Phishing Detection
