ProGQL: A Provenance Graph Query System for Cyber Attack Investigation

Fei Shao, Jia Zou, Zhichao Cao, and Xusheng Xiao

arXiv:2510.22400·cs.CR·October 31, 2025

TL;DR

ProGQL introduces a domain-specific graph query system that enhances cyber attack investigation by enabling flexible, scalable, and memory-efficient provenance analysis through advanced graph search capabilities.

Contribution

The paper presents a novel graph query language and engine tailored to provenance analysis, addressing the inflexibility and memory inefficiency of existing techniques.

Findings

1. ProGQL expresses complex attack scenarios more effectively than Cypher.

2. The framework significantly improves scalability over existing PA techniques.

3. ProGQL reduces memory overhead by supporting incremental graph search.

Abstract

Provenance analysis (PA) has recently emerged as an important solution for cyber attack investigation. PA uses system monitoring to record system activities as a series of system audit events and organizes these events into a provenance graph that shows the dependencies among system activities, which can reveal the steps of a cyber attack. Despite their potential, existing PA techniques face two critical challenges: (1) they are inflexible and non-extensible, making it difficult to incorporate analyst expertise, and (2) they are memory inefficient, often requiring more than 100 GB of RAM to hold entire event streams, which fundamentally limits scalability and deployment in real-world environments. To address these limitations, we propose the ProGQL framework, which provides a domain-specific graph search language with a well-engineered query engine, allowing PA over system audit events and expert…
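To make the idea of a provenance graph concrete, the following is a minimal Python sketch of organizing audit events into a graph and tracing dependencies backward from a suspicious event. It is not ProGQL's language or query engine, and it loads all events into memory, which is exactly the limitation the abstract describes. The `Event` record, the function names, and the sample events are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, namedtuple

# Hypothetical audit event: data flows from `src` to `dst` at time `ts`.
# A process writing a file is process -> file; a process reading a file
# is file -> process.
Event = namedtuple("Event", ["src", "dst", "op", "ts"])


def build_graph(events):
    """Index events by destination node so causes of a node can be looked up."""
    incoming = defaultdict(list)
    for ev in events:
        incoming[ev.dst].append(ev)
    return incoming


def backward_track(incoming, poi):
    """Collect events that could have causally influenced the event `poi`.

    An event can influence `cur` only if it delivered data to cur's source
    node strictly before `cur` happened.
    """
    seen = {poi}
    stack = [poi]
    while stack:
        cur = stack.pop()
        for ev in incoming.get(cur.src, ()):
            if ev.ts < cur.ts and ev not in seen:
                seen.add(ev)
                stack.append(ev)
    return sorted(seen, key=lambda e: e.ts)


# Illustrative (made-up) event stream.
events = [
    Event("vim", "/home/u/notes.txt", "write", 0),
    Event("firefox", "/tmp/payload.sh", "write", 1),
    Event("/tmp/payload.sh", "bash", "read", 2),
    Event("/etc/passwd", "bash", "read", 3),
    Event("bash", "10.0.0.5:443", "send", 4),
    Event("/home/u/notes.txt", "bash", "read", 5),
]

poi = events[4]  # the suspicious network send
trace = backward_track(build_graph(events), poi)
for ev in trace:
    print(ev.ts, ev.src, "->", ev.dst, f"({ev.op})")
```

In this sketch the read of `notes.txt` is excluded because it happens after the suspicious send, and the earlier `vim` write is excluded because its only path to `bash` goes through that later read.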
