Archetypal Analysis++: Rethinking the Initialization Strategy
Sebastian Mair, Jens Sjölund

TL;DR
This paper introduces AA++, a probabilistic initialization method for archetypal analysis inspired by k-means++. It reduces the likelihood of ending up in poor local minima, and an extensive empirical evaluation shows that it improves solution quality.
Contribution
We propose AA++, a probabilistic initialization strategy for archetypal analysis that improves solution quality and almost always outperforms existing initialization methods on diverse real-world data sets.
Findings
AA++ almost always outperforms the baseline initialization methods.
The evaluation covers 15 real-world data sets of varying sizes and dimensionalities, under two pre-processing strategies.
Monte Carlo approximation makes AA++ computationally efficient.
Abstract
Archetypal analysis is a matrix factorization method with convexity constraints. Due to local minima, a good initialization is essential, but frequently used initialization methods either yield sub-optimal starting points or are prone to getting stuck in poor local minima. In this paper, we propose archetypal analysis++ (AA++), a probabilistic initialization strategy for archetypal analysis that sequentially samples points based on their influence on the objective function, similar to k-means++. In fact, we argue that k-means++ already approximates the proposed initialization method. Furthermore, we suggest adapting an efficient Monte Carlo approximation of k-means++ to AA++. In an extensive empirical evaluation of 15 real-world data sets of varying sizes and dimensionalities and considering two pre-processing strategies, we show that AA++ almost always outperforms all baselines,…
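Illustrative sketch (not from the paper)
The abstract states that k-means++ already approximates AA++ and that a Monte Carlo approximation of k-means++ can be adapted to it. The exact AA++ sampling weights are not given in this summary, so the sketch below shows only the two k-means++ building blocks: standard D² sampling, and a generic Metropolis-Hastings approximation of it with uniform proposals. Whether that chain matches the approximation the paper adapts is an assumption. The function names and the chain_length parameter are hypothetical.

```python
import numpy as np


def kmeanspp_init(X, k, rng=None):
    """Pick k row indices of X by D^2 sampling (standard k-means++ seeding)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # First point: uniformly at random.
    chosen = [int(rng.integers(n))]
    # Squared distance from every point to its closest chosen point so far.
    d2 = np.sum((X - X[chosen[0]]) ** 2, axis=1)
    for _ in range(1, k):
        total = d2.sum()
        if total == 0.0:
            # Every point coincides with a chosen one; fall back to uniform.
            idx = int(rng.integers(n))
        else:
            # Sample proportionally to the squared distance.
            idx = int(rng.choice(n, p=d2 / total))
        chosen.append(idx)
        d2 = np.minimum(d2, np.sum((X - X[idx]) ** 2, axis=1))
    return np.array(chosen)


def mcmc_kmeanspp_init(X, k, chain_length=50, rng=None):
    """Approximate D^2 sampling with a short Metropolis-Hastings chain.

    Assumption: uniform proposals; this is a generic approximation, not
    necessarily the variant used in the paper.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    chosen = [int(rng.integers(n))]
    for _ in range(1, k):
        centers = X[chosen]

        def dist2(i):
            # Squared distance from point i to its closest chosen point.
            return float(np.min(np.sum((centers - X[i]) ** 2, axis=1)))

        cur = int(rng.integers(n))
        cur_d2 = dist2(cur)
        for _step in range(chain_length - 1):
            cand = int(rng.integers(n))
            cand_d2 = dist2(cand)
            # Accept with probability min(1, cand_d2 / cur_d2).
            if cur_d2 == 0.0 or cand_d2 / cur_d2 > rng.random():
                cur, cur_d2 = cand, cand_d2
        chosen.append(cur)
    return np.array(chosen)
```

The exact variant touches all n points for every new seed, while the chain evaluates only chain_length candidates per seed, which is the source of the speed-up on large data sets. A call such as kmeanspp_init(X, 5, rng=0) returns five row indices of X that could serve as starting points.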
Taxonomy
Topics: Face and Expression Recognition · Graph Theory and Algorithms · Data Visualization and Analytics
