ABCDE: Application-Based Cluster Diff Evals
Stephan van Staden, Alexander Grubb

TL;DR
ABCDE is a practical, application-specific method for evaluating and comparing clusterings of very large populations. It samples human judgements based on the actual differences between two clusterings, avoiding the need for a pre-defined ground truth.
Contribution
It presents a novel, efficient evaluation technique that leverages pointwise metrics and sampling based on clustering diffs, enhancing understanding and debugging of large-scale clusterings.
Findings
Allows importance-weighted item evaluation.
Reduces human judgement effort by sampling based on actual differences.
Provides metrics for arbitrary item slices for detailed analysis.
Abstract
This paper considers the problem of evaluating clusterings of very large populations of items. Given two clusterings, namely a Baseline clustering and an Experiment clustering, the tasks are twofold: 1) characterize their differences, and 2) determine which clustering is better. ABCDE is a novel evaluation technique for accomplishing that. It aims to be practical: it allows items to have associated importance values that are application-specific, it is frugal in its use of human judgements when determining which clustering is better, and it can report metrics for arbitrary slices of items, thereby facilitating understanding and debugging. The approach to measuring the delta in the clustering quality is novel: instead of trying to construct an expensive ground truth up front and evaluating each clustering with respect to that, where the ground truth must effectively pre-anticipate…
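To make the idea of a clustering diff concrete, here is a minimal Python sketch. It is not the paper's definition of its metrics; it only illustrates the three practical properties named in the abstract under simplifying assumptions. The function names (`changed_items`, `weighted_change_fraction`, `sample_for_judgement`), the representation of a clustering as an item-to-cluster-id dictionary, and the rule "an item changed if its set of cluster-mates differs" are all hypothetical choices made for illustration.

```python
import random

def changed_items(baseline, experiment):
    """Items whose set of cluster-mates differs between the two clusterings.

    Assumption: each clustering is a dict {item: cluster_id} and both
    dicts cover exactly the same items.
    """
    def members(clustering):
        groups = {}
        for item, cid in clustering.items():
            groups.setdefault(cid, set()).add(item)
        return {item: frozenset(groups[cid]) for item, cid in clustering.items()}

    b, e = members(baseline), members(experiment)
    return {item for item in b if b[item] != e[item]}

def weighted_change_fraction(baseline, experiment, weight, in_slice=lambda item: True):
    """Importance-weighted share of items (within a slice) affected by the diff."""
    items = [i for i in baseline if in_slice(i)]
    total = sum(weight[i] for i in items)
    changed = changed_items(baseline, experiment)
    return sum(weight[i] for i in items if i in changed) / total if total else 0.0

def sample_for_judgement(baseline, experiment, weight, k, seed=0):
    """Draw k changed items (with replacement), proportionally to importance."""
    changed = sorted(changed_items(baseline, experiment))
    if not changed:
        return []
    rng = random.Random(seed)
    return rng.choices(changed, weights=[weight[i] for i in changed], k=k)

# Toy data: Experiment merges item "c" into the cluster of "a" and "b".
baseline = {"a": 0, "b": 0, "c": 1, "d": 2}
experiment = {"a": 10, "b": 10, "c": 10, "d": 20}
weight = {"a": 1.0, "b": 1.0, "c": 2.0, "d": 4.0}

overall = weighted_change_fraction(baseline, experiment, weight)
slice_cd = weighted_change_fraction(baseline, experiment, weight,
                                    in_slice=lambda i: i in {"c", "d"})
to_judge = sample_for_judgement(baseline, experiment, weight, k=3)
```

Only items that the diff actually touches are candidates for human judgement, which is where the savings over an up-front ground truth would come from in this simplified picture. Cluster ids are treated as arbitrary labels, so renaming a cluster without changing its membership does not count as a change.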
Taxonomy
Topics: Network Security and Intrusion Detection · Information and Cyber Security · Cloud Computing and Resource Management
