Connected Components for Infinite Graph Streams: Theory and Practice
Jonathan W. Berry, Cynthia A. Phillips, Alexandra M. Porter

TL;DR
This paper introduces XStream, a novel graph streaming model designed for continuous cybersecurity data, enabling efficient maintenance of connected components with bulk deletions, supported by theoretical analysis and a prototype implementation.
Contribution
XStream is the first model tailored for unending cybersecurity graph streams with bulk deletions, providing algorithms and theoretical bounds for connected components maintenance.
Findings
XStream handles 1-5 million edges per second on Intel Sky Lake processors.
Theoretical relationships among query downtime, edge aging, duplication, and bandwidth are established.
Prototype implementation achieves up to one million edges per second, with potential for significant speed improvements.
Abstract
Motivated by the properties of unending real-world cybersecurity streams, we present a new graph streaming model: XStream. We maintain a streaming graph and its connected components at single-edge granularity. In cybersecurity graph applications, input streams typically consist of edge insertions; individual deletions are not explicit. Analysts maintain as much history as possible and will trigger customized bulk deletions when necessary. Despite a variety of dynamic graph processing systems and some canonical literature on theoretical sliding-window graph streaming, XStream is the first model explicitly designed to accommodate this usage model. Users can provide Boolean predicates to define bulk deletions. Edge arrivals are expected to occur continuously and must always be handled. XStream is implemented via a ring of finite-memory processors. We give algorithms to maintain connected…
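To make the usage model in the abstract concrete (an insert-only edge stream, connectivity queries, and bulk deletions defined by a Boolean predicate), here is a minimal Python sketch. It is not XStream's algorithm: it runs in a single process with unbounded memory, keeps every edge, and simply rebuilds a union-find structure after each bulk deletion, whereas the paper describes a ring of finite-memory processors. The class name `StreamingComponents`, its methods, and the `(u, v, timestamp)` edge layout are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, Hashable, List, Tuple

# Hypothetical edge layout: (endpoint, endpoint, arrival timestamp).
Edge = Tuple[Hashable, Hashable, int]


class StreamingComponents:
    """Toy single-process stand-in for the usage model; not the XStream design."""

    def __init__(self) -> None:
        self.parent: Dict[Hashable, Hashable] = {}
        self.edges: List[Edge] = []

    def _find(self, x: Hashable) -> Hashable:
        # Register unseen vertices as their own component.
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every vertex on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def _union(self, u: Hashable, v: Hashable) -> None:
        ru, rv = self._find(u), self._find(v)
        if ru != rv:
            self.parent[ru] = rv

    def insert(self, u: Hashable, v: Hashable, timestamp: int) -> None:
        """Handle one arriving edge and update components immediately."""
        self.edges.append((u, v, timestamp))
        self._union(u, v)

    def connected(self, u: Hashable, v: Hashable) -> bool:
        """Connectivity query; vertices with no stored edge are not connected."""
        if u not in self.parent or v not in self.parent:
            return False
        return self._find(u) == self._find(v)

    def bulk_delete(self, predicate: Callable[[Edge], bool]) -> None:
        """Drop every edge for which predicate(edge) is True, then rebuild."""
        self.edges = [e for e in self.edges if not predicate(e)]
        self.parent = {}
        for u, v, _ in self.edges:
            self._union(u, v)


stream = StreamingComponents()
stream.insert("a", "b", 1)
stream.insert("b", "c", 2)
stream.insert("d", "e", 3)
before = stream.connected("a", "c")

# Age out everything that arrived before timestamp 2.
stream.bulk_delete(lambda edge: edge[2] < 2)
after = stream.connected("a", "c")
```

In this sketch a deletion costs a full pass over the surviving edges, during which queries could not be answered. The abstract indicates that the paper treats this kind of cost explicitly, since edge arrivals "must always be handled" even while a bulk deletion is in progress.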
Taxonomy
Topics: Graph Theory and Algorithms · Distributed systems and fault tolerance · Cloud Computing and Resource Management
