Learning the APT Kill Chain: Temporal Reasoning over Provenance Data for Attack Stage Estimation
Trung V. Phan, Thomas Bauschert

TL;DR
This paper introduces StageFinder, a framework that combines a graph neural network with an LSTM to estimate APT attack stages from fused host and network provenance data, improving stability and precision over existing methods.
Contribution
The paper presents a novel temporal-graph learning framework combining GNNs and LSTMs for precise APT stage inference from provenance data, outperforming prior approaches.
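To make the structural-encoding half of this idea concrete, here is a minimal sketch of one round of mean-neighbor message passing over a tiny provenance graph, followed by mean pooling into a single snapshot embedding. It is not StageFinder's actual encoder: the node names, the one-hot type features, the undirected treatment of edges, and the helper names `message_pass` and `graph_embedding` are all assumptions made for illustration. In a pipeline of the kind described, a sequence of such snapshot embeddings would then be fed to a recurrent model such as an LSTM, which is not shown here.

```python
def message_pass(features, edges):
    """One round of mean aggregation over each node and its neighbours.

    features: dict mapping node id -> list of floats (all the same length)
    edges:    list of (src, dst) pairs, treated as undirected (an assumption)
    """
    neighbours = {node: [node] for node in features}  # each node includes itself
    for src, dst in edges:
        neighbours[src].append(dst)
        neighbours[dst].append(src)

    updated = {}
    for node, group in neighbours.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][i] for n in group) / len(group) for i in range(dim)
        ]
    return updated


def graph_embedding(features):
    """Mean-pool node vectors into one vector for the whole snapshot."""
    dim = len(next(iter(features.values())))
    return [
        sum(vec[i] for vec in features.values()) / len(features)
        for i in range(dim)
    ]


# Hypothetical snapshot: one-hot node type as [process, file, connection].
snapshot_features = {
    "proc:bash": [1.0, 0.0, 0.0],
    "file:/etc/passwd": [0.0, 1.0, 0.0],
    "conn:10.0.0.5:443": [0.0, 0.0, 1.0],
}
snapshot_edges = [
    ("proc:bash", "file:/etc/passwd"),   # process reads a file
    ("proc:bash", "conn:10.0.0.5:443"),  # process opens a connection
]

hidden = message_pass(snapshot_features, snapshot_edges)
embedding = graph_embedding(hidden)
print(embedding)
```

After one round, each node's vector mixes in the types of its direct neighbours, so the file node now carries evidence that a process touched it. A learned GNN replaces the plain mean with trainable weight matrices and nonlinearities.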
Findings
Achieves a macro F1-score of 0.96 in stage estimation.
Reduces prediction volatility by 31% compared to baselines.
Effectively fuses host and network provenance data for attack progression inference.
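The two headline metrics can be sketched as follows. Macro F1 is the standard unweighted mean of per-class F1 scores. The summary does not define "prediction volatility", so the version below is an assumption: the fraction of consecutive time steps at which the predicted stage changes. The stage labels and the function names `macro_f1`, `volatility`, and `relative_reduction` are illustrative only.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def volatility(predictions):
    """Assumed definition: share of consecutive steps where the label flips."""
    if len(predictions) < 2:
        return 0.0
    flips = sum(1 for a, b in zip(predictions, predictions[1:]) if a != b)
    return flips / (len(predictions) - 1)


def relative_reduction(baseline, model):
    """Fractional decrease of `model` relative to `baseline`."""
    return (baseline - model) / baseline


# Illustrative labels only; not data from the paper.
y_true = ["reconnaissance", "reconnaissance", "exfiltration", "exfiltration"]
y_pred = ["reconnaissance", "exfiltration", "exfiltration", "exfiltration"]

print(macro_f1(y_true, y_pred))
print(volatility(y_pred))
print(relative_reduction(0.5, 0.25))
```

Under this assumed definition, a 31% reduction would mean `relative_reduction(baseline_volatility, model_volatility)` evaluates to about 0.31.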
Abstract
Advanced Persistent Threats (APTs) evolve through multiple stages, each exhibiting distinct temporal and structural behaviors. Accurate stage estimation is critical for enabling adaptive cyber defense. This paper presents StageFinder, a temporal-graph learning framework for multi-stage attack progression inference from fused host and network provenance data. Provenance graphs are encoded using a graph neural network to capture structural dependencies among processes, files, and connections, while a long short-term memory (LSTM) model learns temporal dynamics to estimate stage probabilities aligned with the MITRE ATT&CK framework. The model is pretrained on the DARPA OpTC dataset and fine-tuned on labeled DARPA Transparent Computing data. Experimental results demonstrate that StageFinder achieves a macro F1-score of 0.96 and reduces prediction volatility by 31% compared to…
