A Revenue Function for Comparison-Based Hierarchical Clustering
Aishik Mandal, Michaël Perrot, Debarghya Ghoshdastidar

TL;DR
This paper introduces a new revenue function for evaluating and recovering hierarchical clusterings solely from comparison data, bridging the gap between comparison-based methods and traditional similarity-based approaches.
Contribution
It proposes a novel revenue function for assessing dendrograms using only comparisons and provides algorithms to optimize this function for hierarchical clustering.
Findings
The revenue function closely relates to Dasgupta's cost for similarity-based clustering.
Theoretical results show approximate recovery of latent hierarchies from few triplet comparisons.
Empirical comparisons demonstrate the effectiveness of the proposed algorithms.
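For context on the first finding, Dasgupta's cost scores a dendrogram by summing, over every pair of items, their similarity multiplied by the number of leaves under their lowest common ancestor; lower is better. The following is a minimal sketch of that cost only, not of the paper's revenue function. The nested-tuple tree encoding, the helper names, and the toy similarity are assumptions made for illustration.

```python
from itertools import combinations

def leaves(tree):
    """Return the leaves under a nested-tuple tree; any non-tuple is a leaf."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def dasgupta_cost(tree, sim):
    """Sum of sim(i, j) * (number of leaves under the LCA of i and j)."""
    if not isinstance(tree, tuple):
        return 0.0
    child_leaves = [leaves(child) for child in tree]
    n_leaves = sum(len(group) for group in child_leaves)
    cost = 0.0
    # Pairs separated at this node have this node as their LCA.
    for group_a, group_b in combinations(child_leaves, 2):
        for i in group_a:
            for j in group_b:
                cost += sim(i, j) * n_leaves
    # Pairs inside the same child are charged further down the tree.
    return cost + sum(dasgupta_cost(child, sim) for child in tree)

# Hypothetical toy similarity: items {0, 1} and {2, 3} form two tight groups.
def toy_sim(i, j):
    return 1.0 if i // 2 == j // 2 else 0.1

good_cost = dasgupta_cost(((0, 1), (2, 3)), toy_sim)  # merges similar items first
bad_cost = dasgupta_cost(((0, 2), (1, 3)), toy_sim)   # merges dissimilar items first
```

On this toy input the tree that merges similar items first gets the lower cost, which is the behaviour a similarity-based objective of this kind is meant to reward.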
Abstract
Comparison-based learning addresses the problem of learning when, instead of explicit features or pairwise similarities, one only has access to comparisons of the form: "Object A is more similar to B than to C." Recently, it has been shown that, in hierarchical clustering, single and complete linkage can be directly implemented using only such comparisons, while several algorithms have been proposed to emulate the behaviour of average linkage. Hence, finding hierarchies (or dendrograms) using only comparisons is a well-understood problem. However, evaluating their meaningfulness when neither ground truth nor explicit similarities are available remains an open question. In this paper, we bridge this gap by proposing a new revenue function that allows one to measure the goodness of dendrograms using only comparisons. We show that this function is closely related to Dasgupta's cost…
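To make the comparison format concrete, the sketch below encodes each statement "A is more similar to B than to C" as a tuple `(a, b, c)` and derives such triplets from a hidden similarity that a learner would never see directly. The function name, the tuple encoding, and the toy similarity are assumptions for illustration and are not taken from the paper.

```python
from itertools import permutations

def triplets_from_similarity(items, sim):
    """Return every (a, b, c) with sim(a, b) > sim(a, c),
    read as 'a is more similar to b than to c'."""
    return [
        (a, b, c)
        for a, b, c in permutations(items, 3)
        if sim(a, b) > sim(a, c)
    ]

# Hypothetical hidden similarity: items {0, 1} and {2, 3} form two groups.
def hidden_sim(i, j):
    return 1.0 if i // 2 == j // 2 else 0.1

triplets = triplets_from_similarity(range(4), hidden_sim)
```

In a comparison-based setting only a list like `triplets` (often a small, possibly noisy subset of it) is available, while `hidden_sim` itself is not.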
Taxonomy
Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition · Complex Network Analysis Techniques
