TL;DR
This paper introduces PMIV, a graph-based vectorization method that uses PageRank measures to classify .NET files as benign or malicious. It reports high classification accuracy on 2.5 million samples, with a median decompilation and scoring time of 24 ms.
Contribution
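The paper presents PMIV, a vectorization technique that treats each function's directed graph as a Markov chain, integrates hand-engineered vertex functions against the graph's PageRank measure, and aggregates the resulting per-graph vectors into a single file-level vector for malware classification.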
Findings
High classification accuracy on 2.5 million samples
Median decompilation and scoring time of 24 ms
Outperforms text-only feature baseline
Abstract
We classify .NET files as either benign or malicious by examining directed graphs derived from the set of functions comprising the given file. Each graph is viewed probabilistically as a Markov chain where each node represents a code block of the corresponding function, and by computing the PageRank vector (Perron vector with transport), a probability measure can be defined over the nodes of the given graph. Each graph is vectorized by computing Lebesgue antiderivatives of hand-engineered functions defined on the vertex set of the given graph against the PageRank measure. Files are subsequently vectorized by aggregating the set of vectors corresponding to the set of graphs resulting from decompiling the given file. The result is a fast, intuitive, and easy-to-compute glass-box vectorization scheme, which can be leveraged for training a standalone classifier or to augment an existing…
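The pipeline described in the abstract (PageRank measure per graph, integration of vertex functions against that measure, aggregation per file) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a damping factor of 0.85, uniform redistribution of mass from nodes without successors, a plain integral (a weighted sum over vertices) rather than the antiderivatives the abstract mentions, and an element-wise mean as the file-level aggregation. The function names and the two example vertex functions are hypothetical.

```python
from typing import Callable, Dict, List

Graph = Dict[int, List[int]]  # node -> list of successor nodes


def pagerank(adj: Graph, n: int, damping: float = 0.85,
             max_iter: int = 200, tol: float = 1e-12) -> List[float]:
    """Power iteration for PageRank on nodes 0..n-1 (damping value is an assumption)."""
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new = [(1.0 - damping) / n] * n
        for u in range(n):
            succ = adj.get(u, [])
            if succ:
                share = damping * rank[u] / len(succ)
                for v in succ:
                    new[v] += share
            else:
                # Assumption: a node with no successors spreads its mass uniformly.
                share = damping * rank[u] / n
                for v in range(n):
                    new[v] += share
        delta = sum(abs(a - b) for a, b in zip(new, rank))
        rank = new
        if delta < tol:
            break
    return rank


def vectorize_graph(adj: Graph, n: int,
                    features: List[Callable[[Graph, int], float]]) -> List[float]:
    """Integrate each vertex function against the PageRank measure: sum_v f(v) * mu(v)."""
    mu = pagerank(adj, n)
    return [sum(f(adj, v) * mu[v] for v in range(n)) for f in features]


def aggregate_file(graph_vectors: List[List[float]]) -> List[float]:
    """Assumed aggregation: element-wise mean over the per-graph vectors."""
    count = len(graph_vectors)
    return [sum(col) / count for col in zip(*graph_vectors)]


# Hypothetical vertex functions: the constant 1 and the out-degree of a node.
features = [
    lambda adj, v: 1.0,
    lambda adj, v: float(len(adj.get(v, []))),
]

cycle = {0: [1], 1: [2], 2: [0]}   # three code blocks in a loop
chain = {0: [1]}                   # node 1 has no successors

file_vector = aggregate_file([
    vectorize_graph(cycle, 3, features),
    vectorize_graph(chain, 2, features),
])
print(file_vector)
```

Because the PageRank vector sums to one, integrating the constant function 1 always yields 1, which makes a convenient sanity check. The other entries are PageRank-weighted averages of the chosen vertex functions, which is what makes the resulting vector easy to inspect.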
