ProGQL: A Provenance Graph Query System for Cyber Attack Investigation
Fei Shao, Jia Zou, Zhichao Cao, and Xusheng Xiao

TL;DR
ProGQL introduces a domain-specific graph query system that enhances cyber attack investigation by enabling flexible, scalable, and memory-efficient provenance analysis through advanced graph search capabilities.
Contribution
It presents a novel graph query language and engine tailored for provenance analysis, addressing inflexibility and memory inefficiency in existing techniques.
Findings
ProGQL expresses complex attack scenarios more effectively than Cypher.
The framework significantly improves scalability over existing PA techniques.
ProGQL reduces memory overhead by supporting incremental graph search.
Abstract
Provenance analysis (PA) has recently emerged as an important solution for cyber attack investigation. PA uses system monitoring to record system activities as a series of system audit events and organizes these events into a provenance graph that shows the dependencies among system activities, which can reveal the steps of cyber attacks. Despite their potential, existing PA techniques face two critical challenges: (1) they are inflexible and non-extensible, making it difficult to incorporate analyst expertise, and (2) they are memory inefficient, often requiring >100 GB of RAM to hold entire event streams, which fundamentally limits scalability and deployment in real-world environments. To address these limitations, we propose the ProGQL framework, which provides a domain-specific graph search language with a well-engineered query engine, allowing PA over system audit events and expert…
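To make the idea of a provenance graph concrete, the following is a minimal Python sketch of generic backward dependency tracking over audit events. It is not ProGQL syntax and does not reflect the ProGQL query engine; the `Event` record, the function names, and the sample events are all hypothetical and chosen only for illustration. The sketch assumes each event carries a data-flow source, a destination, an operation, and a timestamp, and that an event can only influence the point of interest if it happened no later than the events it feeds into.

```python
from collections import defaultdict, namedtuple

# Hypothetical audit event: data flows from `src` to `dst` at time `ts`.
Event = namedtuple("Event", ["src", "dst", "op", "ts"])


def build_incoming_index(events):
    """Group events by destination node so a backward search can
    look up the edges flowing into a given node."""
    incoming = defaultdict(list)
    for e in events:
        incoming[e.dst].append(e)
    return incoming


def backward_track(incoming, poi, poi_ts):
    """Collect events the point-of-interest node may causally depend on.

    An incoming event is kept only if its timestamp is no later than
    the latest time at which its destination could still influence
    the point of interest.
    """
    deadline = {poi: poi_ts}  # node -> latest relevant timestamp
    stack = [poi]
    found = set()
    while stack:
        node = stack.pop()
        limit = deadline[node]
        for e in incoming.get(node, []):
            if e.ts > limit:
                continue  # happened too late to matter
            found.add(e)
            # Expand the source again if it gained a later deadline.
            if e.src not in deadline or e.ts > deadline[e.src]:
                deadline[e.src] = e.ts
                stack.append(e.src)
    return sorted(found, key=lambda e: e.ts)


# Hypothetical sample events (not taken from the paper).
events = [
    Event("bash", "curl", "start", 1),
    Event("evil.example:80", "curl", "read", 2),
    Event("curl", "/tmp/payload", "write", 3),
    Event("vim", "/tmp/notes", "write", 5),
    Event("10.0.0.9:443", "curl", "read", 7),
]

chain = backward_track(build_incoming_index(events), "/tmp/payload", 3)
for e in chain:
    print(e.ts, e.src, "->", e.dst, e.op)
```

Only the destination index and the visited frontier are consulted during the search, which hints at why a search that fetches events on demand could avoid holding an entire event stream in memory. How ProGQL actually achieves this is not specified in the text above.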
