Identifying bias in cluster quality metrics

Martí Renedo-Mirambell; Argimiro Arratia

arXiv:2112.06287 · physics.soc-ph · July 8, 2025

TL;DR

This paper investigates biases in popular cluster quality metrics such as conductance and modularity. It finds that most of them favor partitions into fewer, larger clusters, and it introduces a new metric, the density ratio, which is among the less biased.

Contribution

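The study analyzes biases in existing metrics by applying them to synthetic networks with preset community structures, generated from stochastic and preferential attachment block models, and proposes the density ratio as a less biased alternative.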

Findings

01 Most metrics favor fewer, larger clusters.

02 Modularity and density ratio are less biased.

03 Synthetic models effectively reveal metric biases.

Abstract

We study potential biases of popular cluster quality metrics, such as conductance or modularity. We propose a method that uses both stochastic block model and preferential attachment block model constructions to generate networks with preset community structures, to which quality metrics will be applied. These models also allow us to generate multi-level structures of varying strength, which will show if metrics favour partitions into a larger or smaller number of clusters. Additionally, we propose another quality metric, the density ratio. We observed that most of the studied metrics tend to favour partitions into a smaller number of big clusters, even when their relative internal and external connectivity are the same. The metrics found to be less biased are modularity and density ratio.
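To make the setup in the abstract concrete, the following is a minimal sketch in plain Python rather than the authors' implementation. It samples a stochastic block model with a preset community structure and scores two partitions of the same graph (four blocks, and the same blocks merged in pairs) with modularity, mean conductance, and a density ratio. The abstract does not define the density ratio, so the form used here, one minus the ratio of external to internal edge density, is an assumption. The function names (`sbm`, `cluster_counts`, `modularity`, `mean_conductance`, `density_ratio`) and the parameter values are hypothetical, and the preferential attachment block model is not covered.

```python
import random
from collections import Counter
from itertools import combinations


def sbm(sizes, p_in, p_out, seed=0):
    """Sample an undirected stochastic block model; returns (edges, labels)."""
    rng = random.Random(seed)
    labels = [block for block, size in enumerate(sizes) for _ in range(size)]
    edges = []
    for u, v in combinations(range(len(labels)), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return edges, labels


def cluster_counts(edges, labels):
    """Per-cluster volume (degree sum), internal edge count, and cut size."""
    vol, internal, cut = Counter(), Counter(), Counter()
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        vol[cu] += 1
        vol[cv] += 1
        if cu == cv:
            internal[cu] += 1
        else:
            cut[cu] += 1
            cut[cv] += 1
    return vol, internal, cut


def modularity(edges, labels):
    """Newman modularity: sum over clusters of e_c / m - (vol_c / 2m)^2."""
    m = len(edges)
    vol, internal, _ = cluster_counts(edges, labels)
    return sum(internal[c] / m - (vol[c] / (2 * m)) ** 2 for c in vol)


def mean_conductance(edges, labels):
    """Average over clusters of cut(S) / min(vol(S), vol(complement of S))."""
    total = 2 * len(edges)
    vol, _, cut = cluster_counts(edges, labels)
    values = [cut[c] / min(vol[c], total - vol[c]) for c in vol if 0 < vol[c] < total]
    return sum(values) / len(values)


def density_ratio(edges, labels):
    """ASSUMED definition: 1 - (external edge density / internal edge density)."""
    n = len(labels)
    int_pairs = sum(s * (s - 1) // 2 for s in Counter(labels).values())
    ext_pairs = n * (n - 1) // 2 - int_pairs
    int_edges = sum(1 for u, v in edges if labels[u] == labels[v])
    ext_edges = len(edges) - int_edges
    return 1 - (ext_edges / ext_pairs) / (int_edges / int_pairs)


if __name__ == "__main__":
    # Illustrative parameters only: four blocks of 25 vertices each.
    edges, fine = sbm([25] * 4, p_in=0.3, p_out=0.05, seed=1)
    coarse = [block // 2 for block in fine]  # merge blocks pairwise into 2 clusters
    for name, labels in (("4 clusters", fine), ("2 clusters", coarse)):
        print(name,
              "modularity=%.3f" % modularity(edges, labels),
              "conductance=%.3f" % mean_conductance(edges, labels),
              "density_ratio=%.3f" % density_ratio(edges, labels))
```

Comparing the two printed rows shows how each score changes when the same graph is partitioned into fewer, larger clusters. Note that lower conductance is better, while higher modularity and density ratio are better. A single sampled graph is only an illustration and does not by itself demonstrate the biases reported in the paper.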

Taxonomy

Topics: Complex Network Analysis Techniques · Functional Brain Connectivity Studies