Sandbox Sample Classification Using Behavioral Indicators of Compromise

M. Andrecut

arXiv:2201.07359·cs.CR·January 20, 2022·1 cites

TL;DR

This paper presents a machine learning approach to classify sandbox samples as malicious or benign based on behavioral indicators of compromise, utilizing traditional and Monte Carlo-inspired methods with real-world data.

Contribution

It introduces a novel classification approach combining traditional ML methods with Monte Carlo-inspired techniques for analyzing sandbox behavioral data.

Findings

01 Effective classification of sandbox samples achieved

02 Monte Carlo-inspired method shows promising results

03 Validated on ThreatGRID and ReversingLabs datasets

Abstract

Behavioral Indicators of Compromise (BICs) are associated with various automated methods used to extract sample behavior by observing the system function calls performed in a virtual execution environment. Thus, every sample is described by the set of BICs triggered by its behavior in the sandbox environment. Here we discuss a Machine Learning approach to classifying sandbox samples as MALICIOUS or BENIGN, based on the list of triggered BICs. Besides more traditional methods such as Logistic Regression and Naive Bayes classification, we also discuss a different approach inspired by statistical Monte Carlo methods. The numerical results are illustrated using ThreatGRID and ReversingLabs data.
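To make the setup in the abstract concrete, the following is a minimal sketch of one of the traditional baselines it names, Naive Bayes, applied to samples represented as sets of triggered indicators. It is not the paper's implementation: the Bernoulli variant, the Laplace smoothing parameter `alpha`, the function names, the indicator names, and the four-sample toy dataset are all assumptions made for illustration, and the Monte Carlo-inspired method is not shown because the abstract does not describe it.

```python
import math
from collections import Counter

def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on sets of triggered indicators.

    samples: list of sets of indicator names (hypothetical names here)
    labels:  list of class labels, e.g. "MALICIOUS" or "BENIGN"
    alpha:   Laplace smoothing constant (an assumption, not from the paper)
    """
    vocab = sorted(set().union(*samples))
    classes = sorted(set(labels))
    model = {"vocab": vocab, "classes": classes, "prior": {}, "p": {}}
    for c in classes:
        docs = [s for s, y in zip(samples, labels) if y == c]
        model["prior"][c] = len(docs) / len(samples)
        # Number of samples of class c in which each indicator fired.
        counts = Counter(b for s in docs for b in s)
        model["p"][c] = {
            b: (counts[b] + alpha) / (len(docs) + 2 * alpha) for b in vocab
        }
    return model

def predict(model, sample):
    """Return the class with the highest log-posterior for one sample."""
    scores = {}
    for c in model["classes"]:
        logp = math.log(model["prior"][c])
        for b in model["vocab"]:
            p = model["p"][c][b]
            # Both presence and absence of an indicator carry evidence.
            logp += math.log(p if b in sample else 1.0 - p)
        scores[c] = logp
    return max(scores, key=scores.get)

# Toy, hand-made data; the indicator names are invented for this example.
train_samples = [
    {"registry-autorun", "process-injection"},
    {"process-injection", "network-beacon"},
    {"file-read"},
    {"file-read", "window-created"},
]
train_labels = ["MALICIOUS", "MALICIOUS", "BENIGN", "BENIGN"]

model = train_bernoulli_nb(train_samples, train_labels)
print(predict(model, {"process-injection"}))
print(predict(model, {"file-read"}))
```

Indicators that never appeared in training are ignored at prediction time in this sketch, since the model only iterates over its own vocabulary; a real pipeline would need an explicit policy for unseen indicators.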

Taxonomy

Topics: Scientific Computing and Data Management

Methods: Logistic Regression