HAPSSA: Holistic Approach to PDF Malware Detection Using Signal and Statistical Analysis
Tajuddin Manhar Mohammed, Lakshmanan Nataraj, Satish Chikkagoudar, Shivkumar Chandrasekaran, B.S. Manjunath

TL;DR
This paper presents a holistic PDF malware detection method combining signal and statistical analysis, achieving high accuracy and robustness against obfuscation, outperforming traditional ML-based approaches.
Contribution
The paper introduces a novel holistic detection approach that integrates static and dynamic features for improved robustness against malware obfuscation.
Findings
Achieves 99.92% detection rate on a large dataset
Detects malware obfuscation techniques undetected by antiviruses
Outperforms existing machine learning methods in robustness
Abstract
Malicious PDF documents present a serious threat to various security organizations that require modern threat intelligence platforms to effectively analyze and characterize the identity and behavior of PDF malware. State-of-the-art approaches use machine learning (ML) to learn features that characterize PDF malware. However, ML models are often susceptible to evasion attacks, in which an adversary obfuscates the malware code to avoid being detected by an Antivirus. In this paper, we derive a simple yet effective holistic approach to PDF malware detection that leverages signal and statistical analysis of malware binaries. This includes combining orthogonal feature space models from various static and dynamic malware detection methods to enable generalized robustness when faced with code obfuscations. Using a dataset of nearly 30,000 PDF files containing both malware and benign samples,…
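As a rough illustration of what "statistical analysis of malware binaries" can mean in practice, the sketch below computes two generic byte-level statistics over a file's raw bytes: a normalized byte histogram and the Shannon entropy. These are common, well-known features and are not claimed to be the ones used in the paper; the function names (byte_histogram, shannon_entropy, statistical_features) are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def byte_histogram(data: bytes) -> list[float]:
    """Return a normalized 256-bin histogram of byte values.

    Illustrative feature only; not taken from the paper.
    """
    if not data:
        return [0.0] * 256
    counts = Counter(data)
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the byte stream in bits per byte."""
    probabilities = byte_histogram(data)
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def statistical_features(data: bytes) -> list[float]:
    """Concatenate the histogram and the entropy into one feature vector.

    The 257-element layout is an assumption made for this sketch.
    """
    return byte_histogram(data) + [shannon_entropy(data)]
```

A vector like this could be computed from the bytes of a PDF (for example, open(path, "rb").read()) and passed to any standard classifier. Compressed or encrypted streams tend to push the entropy toward 8 bits per byte, which is one reason entropy is a popular first-pass statistic for binaries.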
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Digital and Cyber Forensics
