Evaluation of Cluster Id Assignment Schemes with ABCDE
Stephan van Staden

TL;DR
This paper introduces a scalable evaluation method, ABCDE, for assessing the stability and quality of cluster id assignment schemes, especially in large-scale, real-world clustering applications.
Contribution
It presents a novel, scalable framework for evaluating cluster id assignment schemes using ABCDE, including generalizations for complex cluster changes.
Findings
ABCDE metrics characterize both the magnitude and the quality of id assignment diffs between a baseline and an experiment in large datasets
The framework supports evaluation of id stability over time
It accommodates complex cluster membership mutations
Abstract
A cluster id assignment scheme labels each cluster of a clustering with a distinct id. The goal of id assignment is semantic id stability, which means that, whenever possible, a cluster for the same underlying concept as that of a historical cluster should ideally receive the same id as the historical cluster. Semantic id stability allows the users of a clustering to refer to a concept's cluster with an id that is stable across clusterings/time. This paper treats the problem of evaluating the relative merits of id assignment schemes. In particular, it considers a historical clustering with id assignments, and a new clustering with ids assigned by a baseline and an experiment. It produces metrics that characterize both the magnitude and the quality of the id assignment diffs between the baseline and the experiment. That happens by transforming the problem of cluster id assignment into a…
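The setup in the abstract (a historical clustering with ids, plus a new clustering whose ids are assigned by a baseline and by an experiment) can be sketched roughly as follows. This is only an illustration under stated assumptions: the greedy overlap scheme and the always-fresh scheme are hypothetical stand-ins, not schemes from the paper, and all function names are invented for this example. The sketch collects the items whose id differs between the two schemes; it does not reproduce the ABCDE metrics themselves.

```python
def assign_ids_by_overlap(historical, new_clusters):
    """Hypothetical baseline scheme: greedily reuse the id of the historical
    cluster that shares the most items with each new cluster.

    historical: dict mapping id -> set of items
    new_clusters: list of sets of items
    Returns a list of distinct ids, one per new cluster.
    Assumes no historical id starts with "fresh-".
    """
    candidates = []
    for idx, cluster in enumerate(new_clusters):
        for hid, members in historical.items():
            overlap = len(cluster & members)
            if overlap > 0:
                candidates.append((-overlap, idx, hid))
    candidates.sort()  # largest overlap first, ties broken deterministically

    ids = [None] * len(new_clusters)
    used = set()
    for _, idx, hid in candidates:
        if ids[idx] is None and hid not in used:
            ids[idx] = hid
            used.add(hid)
    # Clusters without a reusable historical id get a fresh one.
    return [cid if cid is not None else f"fresh-{i}" for i, cid in enumerate(ids)]


def assign_fresh_ids(new_clusters):
    """Hypothetical experiment scheme: ignore history and mint new ids."""
    return [f"fresh-{i}" for i in range(len(new_clusters))]


def item_ids(clusters, ids):
    """Map every item to the id of the cluster that contains it."""
    return {item: cid for cluster, cid in zip(clusters, ids) for item in cluster}


def id_diff_items(base, exp):
    """Items whose cluster id differs between baseline and experiment."""
    return {item for item in base if base[item] != exp[item]}


historical = {"A": {1, 2, 3}, "B": {4, 5}}
new_clusters = [{1, 2}, {3, 4, 5}, {6}]

baseline_ids = assign_ids_by_overlap(historical, new_clusters)
experiment_ids = assign_fresh_ids(new_clusters)
diff = id_diff_items(item_ids(new_clusters, baseline_ids),
                     item_ids(new_clusters, experiment_ids))
print(baseline_ids, experiment_ids, sorted(diff))
```

In this toy example the baseline keeps ids "A" and "B" for the clusters that continue the historical concepts, while the experiment does not, so every item of those two clusters shows up in the diff. Judging the quality of such diffs is what the paper's metrics address.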
Taxonomy
Topics: Transportation and Mobility Innovations · Auction Theory and Applications · Supply Chain and Inventory Management
