Scaling Analysis of Affinity Propagation
Cyril Furtlehner, Michele Sebag, Xiangliang Zhang

TL;DR
This paper investigates the scaling properties of the Affinity Propagation clustering algorithm, demonstrating how hierarchical strategies can reduce complexity and identifying a critical parameter value that reveals the number of clusters in data.
Contribution
It introduces a hierarchical divide-and-conquer approach to improve AP's scalability and uncovers a phase transition in cluster structure related to the penalty parameter.
Findings
Hierarchical strategy reduces AP complexity from O(N^2) to O(N^{(h+2)/(h+1)})
Precision loss is minimal for data embedded in more than two dimensions; dimension 2 is the exception
A critical penalty value s* separates fragmentation and coalescence phases, aiding cluster count estimation
Abstract
We analyze and exploit some scaling properties of the Affinity Propagation (AP) clustering algorithm proposed by Frey and Dueck (2007). First we observe that a divide and conquer strategy, used hierarchically on a large data set, reduces the complexity from O(N^2) to O(N^{(h+2)/(h+1)}), for a data-set of size N and a depth h of the hierarchical strategy. For a data-set embedded in a d-dimensional space, we show that this is obtained without notably damaging the precision except in dimension d = 2. In fact, for d larger than 2 the relative loss in precision scales like N^{(2-d)/((h+1)d)}. Finally, under some conditions we observe that there is a value s* of the penalty coefficient, a free parameter used to fix the number of clusters, which separates a fragmentation phase (for s < s*) from a coalescent one (for s > s*) of the underlying hidden cluster structure.…
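To make the complexity claim concrete, the short Python sketch below evaluates the exponent (h+2)/(h+1) for a few hierarchy depths. The function names `hierarchical_exponent` and `estimated_cost` are hypothetical, chosen for illustration only, and constant factors are ignored, so the numbers indicate orders of magnitude rather than measured run times.

```python
def hierarchical_exponent(h: int) -> float:
    """Exponent e in O(N^e) for a hierarchy of depth h (formula from the abstract).

    Plugging in h = 0 gives e = 2, which matches the O(N^2) cost of plain AP.
    """
    if h < 0:
        raise ValueError("depth h must be non-negative")
    return (h + 2) / (h + 1)


def estimated_cost(n: int, h: int) -> float:
    """Order-of-magnitude operation count N^e, ignoring constant factors (assumption)."""
    return float(n) ** hierarchical_exponent(h)


if __name__ == "__main__":
    n = 1_000_000  # illustrative data-set size, not taken from the paper
    for h in range(4):
        e = hierarchical_exponent(h)
        print(f"h={h}: exponent={e:.3f}, N^e ~ {estimated_cost(n, h):.2e}")
```

The exponent falls from 2 at h = 0 to 1.5 at h = 1 and approaches 1 as h grows, so most of the gain comes from the first few levels of the hierarchy.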
