Scaling Analysis of Affinity Propagation

Cyril Furtlehner; Michele Sebag; Xiangliang Zhang

arXiv:0910.1800·cs.AI·May 29, 2013

TL;DR

This paper investigates the scaling properties of the Affinity Propagation (AP) clustering algorithm. It shows that a hierarchical divide-and-conquer strategy reduces the algorithm's complexity, and that, under some conditions, a critical value of the penalty parameter indicates the number of clusters in the data.

Contribution

It introduces a hierarchical divide-and-conquer approach to improve AP's scalability and uncovers a phase transition in cluster structure related to the penalty parameter.

Findings

01

Hierarchical strategy reduces AP complexity from O(N^2) to O(N^{(h+2)/(h+1)})

02

Precision loss is minor for data embedded in dimension d > 2; dimension d = 2 is the exception

03

Under some conditions, a critical penalty value s* separates a fragmentation phase (s < s*) from a coalescence phase (s > s*), which aids cluster count estimation
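
To make the first finding concrete, the short sketch below tabulates the complexity exponent (h+2)/(h+1) for a few hierarchy depths h. The function name `complexity_exponent` is illustrative only and does not come from the paper; treating h = 0 as plain AP is an assumption based on the exponent then equalling 2, the O(N^2) baseline.

```python
from fractions import Fraction


def complexity_exponent(h: int) -> Fraction:
    """Exponent e in O(N^e) for a hierarchy of depth h, i.e. (h + 2) / (h + 1).

    Hypothetical helper for illustration; h = 0 is assumed to mean plain AP.
    """
    if h < 0:
        raise ValueError("hierarchy depth h must be non-negative")
    return Fraction(h + 2, h + 1)


if __name__ == "__main__":
    # Print the exponent for increasing depth; it decreases toward 1 as h grows.
    for h in range(5):
        e = complexity_exponent(h)
        print(f"h = {h}: exponent = {e} (about {float(e):.3f})")
```

The exponent stays strictly above 1 for every finite h, so the formula describes an approach toward linear cost rather than linear cost itself.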

Abstract

We analyze and exploit some scaling properties of the Affinity Propagation (AP) clustering algorithm proposed by Frey and Dueck (2007). First we observe that a divide and conquer strategy, used on a large data set hierarchically reduces the complexity $O(N^{2})$ to $O(N^{(h+2)/(h+1)})$, for a data-set of size $N$ and a depth $h$ of the hierarchical strategy. For a data-set embedded in a $d$-dimensional space, we show that this is obtained without notably damaging the precision except in dimension $d = 2$. In fact, for $d$ larger than 2 the relative loss in precision scales like $N^{(2-d)/(h+1)d}$. Finally, under some conditions we observe that there is a value $s^{*}$ of the penalty coefficient, a free parameter used to fix the number of clusters, which separates a fragmentation phase (for $s < s^{*}$) from a coalescent one (for $s > s^{*}$) of the underlying hidden cluster structure.…
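
As a rough illustration of the precision claim in the abstract, the sketch below evaluates the exponent of N in the relative precision loss. It assumes the abstract's expression is to be read as (2 - d) / ((h + 1) d), with the whole product (h + 1) d in the denominator; that reading, and the function name `precision_loss_exponent`, are assumptions and not taken verbatim from the paper.

```python
from fractions import Fraction


def precision_loss_exponent(d: int, h: int) -> Fraction:
    """Exponent of N in the relative precision loss.

    Assumed reading of the abstract's formula: (2 - d) / ((h + 1) * d).
    Hypothetical helper for illustration only.
    """
    if d < 1 or h < 0:
        raise ValueError("need dimension d >= 1 and depth h >= 0")
    return Fraction(2 - d, (h + 1) * d)


if __name__ == "__main__":
    # A negative exponent means the relative loss shrinks as N grows;
    # a zero exponent (d = 2) means it does not shrink with N.
    for d in (2, 3, 10):
        for h in (1, 2):
            print(f"d = {d}, h = {h}: exponent = {precision_loss_exponent(d, h)}")
```

Under this reading the exponent is zero at d = 2 and negative for d > 2, which matches the abstract's statement that precision is preserved except in dimension 2.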
