Comparing Two Partitions of Non-Equal Sets of Units
Marjan Cugmas, Anuška Ferligoj

TL;DR
This paper introduces modified and asymmetric versions of the Rand and Wallace indices for comparing two partitions when the two sets of units only partly overlap, or when the splitting and merging of clusters should count differently toward cluster stability. Chance corrections are included.
Contribution
It proposes modified and asymmetric indices for comparing partitions of non-equal sets of units, addressing limitations of existing indices such as the Rand index, which assumes a common set of units and penalizes splitting and merging alike.
Findings
The modified Rand index accounts for non-equal sets of units.
The asymmetric Wallace index distinguishes the effects of splitting from those of merging.
Chance correction methods are provided for all indices.
Abstract
Rand (1971) proposed what has since become a well-known index for comparing two partitions obtained on the same set of units. The index takes a value on the interval between 0 and 1, where a higher value indicates more similar partitions. Sometimes, e.g. when the units are observed in two time periods, the splitting and merging of clusters should be considered differently, according to the operationalization of the stability of clusters. The Rand Index is symmetric in the sense that both the splitting and merging of clusters lower the value of the index. In such a non-symmetric case, one of the Wallace indexes (Wallace, 1983) can be used. Further, there are several cases when one wants to compare two partitions obtained on different sets of units, where the intersection of these sets of units is a non-empty set of units. In this instance, the new units and units which leave the clusters…
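To make the symmetry point in the abstract concrete, here is a minimal Python sketch of the standard pair-counting definitions of the Rand index and a Wallace index for two partitions of the same units. It does not implement the paper's modified or chance-corrected indices, and the function names (`pair_counts`, `rand_index`, `wallace_index`) and the toy label lists are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def pair_counts(u, v):
    """Classify every pair of units by co-membership in partitions u and v.

    u and v are equal-length lists of cluster labels for the same units.
    Returns (both, only_u, only_v, neither).
    """
    if len(u) != len(v):
        raise ValueError("both partitions must label the same units")
    both = only_u = only_v = neither = 0
    for i, j in combinations(range(len(u)), 2):
        same_u = u[i] == u[j]
        same_v = v[i] == v[j]
        if same_u and same_v:
            both += 1
        elif same_u:
            only_u += 1
        elif same_v:
            only_v += 1
        else:
            neither += 1
    return both, only_u, only_v, neither


def rand_index(u, v):
    """Share of pairs on which the two partitions agree (symmetric in u, v)."""
    both, only_u, only_v, neither = pair_counts(u, v)
    total = both + only_u + only_v + neither
    if total == 0:
        raise ValueError("need at least two units")
    return (both + neither) / total


def wallace_index(u, v):
    """Share of pairs joined in u that are also joined in v (asymmetric)."""
    both, only_u, _, _ = pair_counts(u, v)
    if both + only_u == 0:
        raise ValueError("u places no two units in the same cluster")
    return both / (both + only_u)


# Hypothetical toy data: the first cluster of `first` splits in two in `second`.
first = [0, 0, 0, 0, 1, 1]
second = [0, 0, 1, 1, 2, 2]

print(rand_index(first, second))     # 11/15: lowered by the split
print(wallace_index(first, second))  # 3/7: pairs joined in `first` were separated
print(wallace_index(second, first))  # 1.0: every pair joined in `second` is joined in `first`
```

Read in the other direction, the same pair of partitions describes a merge, so swapping the arguments shows how one Wallace index reacts to splitting while the other reacts to merging, whereas the Rand index is lowered in both cases.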
Taxonomy
Topics: Complex Systems and Time Series Analysis · Complex Network Analysis Techniques
