GraphWeaver: Billion-Scale Cybersecurity Incident Correlation
Scott Freitas, Amir Gharib

TL;DR
GraphWeaver is a scalable, geo-distributed, graph-based framework that improves the accuracy and efficiency of cybersecurity incident correlation at industry scale by integrating domain knowledge and human feedback.
Contribution
GraphWeaver introduces a large-scale, geo-distributed graph approach to cybersecurity incident correlation, combining new algorithms with human-in-the-loop feedback.
Findings
Handles billions of alerts with 99% accuracy
Reduces correlation storage by a factor of 7.4
Integrated into Microsoft Defender XDR worldwide
Abstract
In the dynamic landscape of large enterprise cybersecurity, accurately and efficiently correlating billions of security alerts into comprehensive incidents is a substantial challenge. Traditional correlation techniques often struggle with maintenance, scaling, and adapting to emerging threats and novel sources of telemetry. We introduce GraphWeaver, an industry-scale framework that shifts the traditional incident correlation process to a data-optimized, geo-distributed, graph-based approach. GraphWeaver introduces a suite of innovations tailored to handle the complexities of correlating billions of shared evidence alerts across hundreds of thousands of enterprises. Key among these innovations are a geo-distributed database and PySpark analytics engine for large-scale data processing, a minimum spanning tree algorithm to optimize correlation storage, integration of security domain…
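The abstract names a minimum spanning tree algorithm for optimizing correlation storage but, as excerpted here, does not describe it. The following is a minimal sketch of the general idea only, not the authors' implementation. It assumes that alerts sharing a piece of evidence are pairwise linked, so a group of n alerts produces n(n-1)/2 correlation edges, and that keeping a spanning forest (n-1 edges per connected group) preserves which alerts belong to the same incident. The function name `spanning_forest`, the unit edge weights, and the sample data are all hypothetical.

```python
from itertools import combinations


def spanning_forest(num_nodes, weighted_edges):
    """Kruskal's algorithm with union-find; returns the edges that are kept."""
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge joins two separate components, so it is needed.
            parent[root_u] = root_v
            kept.append((weight, u, v))
    return kept


# Hypothetical data: alerts 0-4 share one piece of evidence (a 5-node clique),
# and alerts 5-6 share another. All weights are 1 for illustration.
clique_edges = [(1, u, v) for u, v in combinations(range(5), 2)]
pair_edge = [(1, 5, 6)]
all_edges = clique_edges + pair_edge

kept_edges = spanning_forest(7, all_edges)
print(len(all_edges), len(kept_edges))
```

In this toy case the 11 pairwise edges shrink to 5 while both alert groups stay connected. The saving grows with group size, since a clique's edge count is quadratic in the number of alerts and a spanning tree's is linear.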
Taxonomy
Topics: Information and Cyber Security · Complex Network Analysis Techniques · Advanced Graph Neural Networks
