Prov2vec: Learning Provenance Graph Representation for Unsupervised APT Detection
Bibek Bhattarai, H. Howie Huang

TL;DR
Prov2Vec introduces a novel provenance graph kernel for representing system behavior, enabling early detection of advanced persistent threats through machine learning analysis of host activity deviations.
Contribution
It presents a new provenance graph kernel that produces compact, effective representations of host behavior for unsupervised APT detection, improving over existing methods.
Findings
The proposed provenance graph kernel yields more compact representations of host behavior than existing methods.
It improves prediction accuracy for APT detection.
It captures deviations in host behavior effectively.
Abstract
Modern cyber attackers use advanced zero-day exploits, highly targeted spear phishing, and other social engineering techniques to gain access, and then use evasion techniques to maintain a prolonged presence within the victim network while working gradually towards their objective. To minimize the damage, it is necessary to detect these Advanced Persistent Threats (APTs) as early in the campaign as possible. This paper proposes Prov2Vec, a system for continuously monitoring the behavior of enterprise hosts to detect attackers' activities. It leverages a data provenance graph built from system event logs to gain complete visibility into the execution state of an enterprise host and the causal relationships between system entities. It proposes a novel provenance graph kernel to obtain a canonical representation of the system behavior, which is compared against its historical behaviors and that…
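As a rough illustration of the pipeline the abstract describes (event logs, then a provenance graph, then a graph representation compared against history), the following is a minimal sketch under stated assumptions. It is not the kernel the paper proposes: a generic Weisfeiler-Lehman-style relabeling is swapped in as a stand-in, and the event tuples, the "type:name" node naming, and all function names are hypothetical.

```python
from collections import Counter
import math


def build_graph(events):
    """Build an adjacency map from (subject, action, object) event tuples.

    The tuple layout is an assumption made for this sketch.
    """
    adj = {}
    for subj, action, obj in events:
        adj.setdefault(subj, []).append((action, obj))
        adj.setdefault(obj, [])
    return adj


def node_type(node):
    # Hypothetical naming convention: "proc:bash", "file:/etc/passwd".
    return node.split(":", 1)[0]


def wl_histogram(adj, iterations=2):
    """Count node labels refined over a few rounds of neighbor aggregation.

    Stand-in for a graph kernel feature map (Weisfeiler-Lehman style),
    NOT the kernel from the paper.
    """
    labels = {n: node_type(n) for n in adj}
    hist = Counter(labels.values())
    for _ in range(iterations):
        new_labels = {}
        for n, edges in adj.items():
            neigh = sorted(f"{a}>{labels[m]}" for a, m in edges)
            new_labels[n] = labels[n] + "(" + ",".join(neigh) + ")"
        labels = new_labels
        hist.update(labels.values())
    return hist


def cosine_distance(h1, h2):
    """Return 1 - cosine similarity between two label histograms."""
    dot = sum(h1[k] * h2[k] for k in h1.keys() & h2.keys())
    n1 = math.sqrt(sum(v * v for v in h1.values()))
    n2 = math.sqrt(sum(v * v for v in h2.values()))
    if n1 == 0 or n2 == 0:
        return 1.0
    return 1.0 - dot / (n1 * n2)


# Made-up events for illustration only.
baseline_events = [
    ("proc:sshd", "fork", "proc:bash"),
    ("proc:bash", "read", "file:/etc/profile"),
]
observed_events = baseline_events + [
    ("proc:bash", "write", "file:/etc/passwd"),
    ("proc:bash", "connect", "sock:10.0.0.5:4444"),
]

baseline = wl_histogram(build_graph(baseline_events))
observed = wl_histogram(build_graph(observed_events))
deviation = cosine_distance(baseline, observed)
```

A larger `deviation` means the observed snapshot differs more from the historical one. In practice a threshold or an unsupervised model would decide what counts as anomalous, and that choice is left open here.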
Taxonomy
Topics: Software System Performance and Reliability · Network Security and Intrusion Detection · Information and Cyber Security
