TL;DR
This study evaluates 25 graph measures for node clustering on 11780 LFR-generated graphs covering the whole LFR parameter space, identifying which measure performs best in each zone of that space to guide measure selection.
Contribution
It systematically analyzes measure performance over the entire LFR parameter space, providing a zone-based recommendation for optimal graph measures in clustering tasks.
Findings
Distinct zones where specific measures outperform others
Geometry of zones described with simple criteria
Guidelines for measure selection based on graph parameters
Abstract
Graph measures that express closeness or distance between nodes can be employed for graph node clustering using metric clustering algorithms. There are numerous measures applicable to this task, and which one performs better is an open question. We study the performance of 25 graph measures on generated graphs with different parameters. While measure comparisons are usually limited to a general ranking of measures on a particular dataset, we aim to explore how the performance of various measures depends on graph features. Using an LFR graph generator, we create a dataset of 11780 graphs covering the whole LFR parameter space. For each graph, we assess the quality of clustering with the k-means algorithm for each considered measure. Based on this, we determine the best measure for each area of the parameter space. We find that the parameter space consists of distinct zones where one particular…
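The pipeline the abstract describes (compute a node-to-node measure, then cluster nodes with k-means) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: a toy graph of two cliques joined by one edge stands in for an LFR graph, the shortest-path distance is used as one example of a distance measure, and k-means is run on the rows of the distance matrix as a simplification. The helper names `shortest_path_matrix`, `kmeans`, and `two_cliques` are hypothetical.

```python
from collections import deque


def shortest_path_matrix(adj):
    """All-pairs hop distances via BFS; adj maps node index -> list of neighbours."""
    n = len(adj)
    dist = [[float("inf")] * n for _ in range(n)]
    for src in range(n):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[src][v] == float("inf"):
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_cliques(size):
    """Toy stand-in for a generated graph: two cliques joined by one bridge edge."""
    adj = {i: [] for i in range(2 * size)}
    for block in (range(size), range(size, 2 * size)):
        for u in block:
            adj[u] = [v for v in block if v != u]
    adj[size - 1].append(size)
    adj[size].append(size - 1)
    return adj


adj = two_cliques(5)
dist = shortest_path_matrix(adj)   # the "measure" step (assumed: shortest path)
labels = kmeans(dist, k=2)         # the clustering step
print(labels)
```

In a study of the kind described, the same two steps would be repeated for each measure and each generated graph, and the resulting labels compared against the generator's ground-truth communities with a clustering quality score.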
