ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations
Maciej Besta, Cesare Miglioli, Paolo Sylos Labini, Jakub Tětek, Patrick Iff, Raghavendra Kanakagiri, Saleh Ashkboos, Kacper Janda, Michal Podstawski, Grzegorz Kwasniewski, Niels Gleinig, Flavio Vella, Onur Mutlu, Torsten Hoefler

TL;DR
ProbGraph represents vertex sets with probabilistic set representations such as Bloom filters, enabling fast, approximate parallel graph mining with strong theoretical guarantees on work, depth, and accuracy.
Contribution
It presents a novel graph representation method using probabilistic sets that accelerates parallel graph mining algorithms with theoretical guarantees on efficiency and accuracy.
Findings
Up to nearly 50x speedup over tuned parallel exact baselines on 32 cores
Accuracy of more than 90% for many input graph datasets
Provides new bounds and algorithms for probabilistic set representations
Abstract
Important graph mining problems such as Clustering are computationally demanding. To significantly accelerate these problems, we propose ProbGraph: a graph representation that enables simple and fast approximate parallel graph mining with strong theoretical guarantees on work, depth, and result accuracy. The key idea is to represent sets of vertices using probabilistic set representations such as Bloom filters. These representations are much faster to process than the original vertex sets thanks to vectorizability and small size. We use these representations as building blocks in important parallel graph mining algorithms such as Clique Counting or Clustering. When enhanced with ProbGraph, these algorithms significantly outperform tuned parallel exact baselines (up to nearly 50x on 32 cores) while ensuring accuracy of more than 90% for many input graph datasets. Our novel bounds and…
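To make the key idea concrete, here is a minimal sketch of estimating the size of a vertex-set intersection from two Bloom filters. Set intersection is the kind of operation that clique counting and clustering coefficients rely on. This is an illustration under stated assumptions, not the authors' implementation. The filter size `M`, the hash count `K`, the BLAKE2-based hashing, and the helper names `bloom`, `estimate_size`, and `estimate_intersection` are all hypothetical. The estimator combines a standard Bloom filter cardinality estimate with inclusion-exclusion; it is not necessarily the estimator analyzed in the paper.

```python
import hashlib
import math

M = 4096  # bits per filter (assumed value, chosen for illustration)
K = 2     # hash functions per element (assumed value)


def bloom(vertices, m=M, k=K):
    """Build a Bloom filter for a vertex set, stored as a Python int bitmask."""
    bits = 0
    for v in vertices:
        for i in range(k):
            # Derive k deterministic hash positions by salting with the index i.
            digest = hashlib.blake2b(f"{i}:{v}".encode(), digest_size=8).digest()
            bits |= 1 << (int.from_bytes(digest, "big") % m)
    return bits


def estimate_size(bits, m=M, k=K):
    """Estimate how many elements were inserted, from the number of set bits."""
    ones = bin(bits).count("1")
    if ones >= m:
        return float("inf")  # filter is saturated; no finite estimate
    return -(m / k) * math.log(1 - ones / m)


def estimate_intersection(a, b, m=M, k=K):
    """Estimate |A ∩ B| as |A| + |B| - |A ∪ B|.

    The bitwise OR of two filters equals the filter of the union,
    so only cheap word-level bit operations are needed.
    """
    return (estimate_size(a, m, k) + estimate_size(b, m, k)
            - estimate_size(a | b, m, k))


# Two hypothetical neighborhoods sharing exactly 20 vertices (40..59).
n_u = bloom(range(0, 60))
n_v = bloom(range(40, 100))
print(f"estimated common neighbors: {estimate_intersection(n_u, n_v):.1f}")
```

The union filter is a single bitwise OR followed by a population count, which hints at why such representations vectorize well: the cost depends on the filter size rather than on the sizes of the original vertex sets.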
Taxonomy
Topics: Caching and Content Delivery · Advanced Graph Neural Networks · Data Mining Algorithms and Applications
