Distributed Graph Clustering using Modularity and Map Equation

Michael Hamann; Ben Strasser; Dorothea Wagner; Tim Zeitz

arXiv:1710.09605 · cs.DS · April 28, 2020

TL;DR

This paper introduces two distributed algorithms, DSLM-Mod and DSLM-Map, that cluster large graphs by optimizing modularity and the map equation, respectively. In an extensive study on real-world and synthetic graphs, both are fast and produce high-quality clusterings.

Contribution

The paper presents two distributed graph clustering algorithms that optimize modularity and the map equation. They scale to graphs with billions of edges and, compared to GossipMap, run faster and use less memory.

Findings

01

Both algorithms are fast and produce high-quality clusterings.

02

Compared to GossipMap, the proposed methods use less memory and are up to ten times faster.

03

Effective on graphs with up to 68 billion edges.

Abstract

We study large-scale, distributed graph clustering. Given an undirected graph, our objective is to partition the nodes into disjoint sets called clusters. A cluster should contain many internal edges while being sparsely connected to other clusters. In the context of a social network, a cluster could be a group of friends. Modularity and map equation are established formalizations of this internally-dense-externally-sparse principle. We present two versions of a simple distributed algorithm to optimize both measures. They are based on Thrill, a distributed big data processing framework that implements an extended MapReduce model. The algorithms for the two measures, DSLM-Mod and DSLM-Map, differ only slightly. Adapting them for similar quality measures is straightforward. We conduct an extensive experimental study on real-world graphs and on synthetic benchmark graphs with up to 68…
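To make the internally-dense-externally-sparse principle concrete, the following is a minimal sequential sketch of how modularity scores a given clustering of an undirected, unweighted graph. It uses the standard textbook definition of modularity and is not the paper's distributed DSLM-Mod algorithm; the function name `modularity`, the edge-list input, and the example graph are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Modularity of a clustering of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    cluster_of: dict mapping every node to its cluster id.
    (Both input formats are assumptions for this sketch.)
    """
    m = 0                        # total number of edges
    internal = defaultdict(int)  # edges with both endpoints in the cluster
    volume = defaultdict(int)    # sum of node degrees in the cluster
    for u, v in edges:
        m += 1
        volume[cluster_of[u]] += 1
        volume[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            internal[cluster_of[u]] += 1
    if m == 0:
        return 0.0
    # Per cluster: fraction of internal edges minus the fraction expected
    # if edges were placed at random with the same node degrees.
    return sum(internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in volume)


# Hypothetical example: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "all" for node in range(6)}

q_split = modularity(edges, split)    # one cluster per triangle
q_merged = modularity(edges, merged)  # everything in one cluster
print(q_split, q_merged)
```

Placing each triangle in its own cluster scores higher than putting all nodes into a single cluster, which matches the intuition that a good cluster has many internal edges and few edges leaving it.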
