Probabilistic methods for approximate archetypal analysis

Ruijian Han; Braxton Osting; Dong Wang; Yiming Xu

arXiv:2108.05767 · stat.CO · May 13, 2022

TL;DR

This paper introduces a probabilistic approximation method for archetypal analysis. It lowers computational cost by reducing both the dimension of the data and the cardinality of its representation, which makes the analysis of large datasets more efficient.

Contribution

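The paper presents a novel probabilistic approach built on two preprocessing techniques. These improve the scalability of archetypal analysis when the data is approximately embedded in a low-dimensional linear subspace and the convex hull of its representations is well approximated by a polytope with few vertices.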

Findings

01 Effective dimension and representation reduction for large datasets

02 Near-optimal solutions with reduced computational cost

03 Applicability demonstrated on real-world datasets

Abstract

Archetypal analysis is an unsupervised learning method for exploratory data analysis. One major challenge that limits the applicability of archetypal analysis in practice is the inherent computational complexity of the existing algorithms. In this paper, we provide a novel approximation approach to partially address this issue. Utilizing probabilistic ideas from high-dimensional geometry, we introduce two preprocessing techniques to reduce the dimension and representation cardinality of the data, respectively. We prove that provided the data is approximately embedded in a low-dimensional linear subspace and the convex hull of the corresponding representations is well approximated by a polytope with a few vertices, our method can effectively reduce the scaling of archetypal analysis. Moreover, the solution of the reduced problem is near-optimal in terms of prediction errors. Our approach…
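
The abstract names two kinds of preprocessing, one that reduces dimension and one that reduces representation cardinality, but this page does not describe either technique in detail. As a rough illustration only, and not the authors' algorithm, the sketch below uses two generic stand-ins: a Gaussian random projection for the dimension step, and maximizers of random linear functionals (each of which is a vertex of the convex hull of a finite point set) for the cardinality step. The function names, the synthetic data, and all parameter values are assumptions made for this example.

```python
import numpy as np


def random_projection(X, k, rng):
    """Map the rows of X (n x d) to k dimensions with a scaled Gaussian matrix.

    Hypothetical helper: a generic random projection, not the paper's method.
    """
    d = X.shape[1]
    G = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ G


def extreme_point_candidates(Y, n_directions, rng):
    """Return indices of rows of Y that maximize some random linear functional.

    A maximizer of a linear functional over a finite point set is a vertex of
    its convex hull, so the returned rows are hull vertices (not necessarily
    all of them). Hypothetical helper for illustration.
    """
    U = rng.standard_normal((Y.shape[1], n_directions))
    idx = np.argmax(Y @ U, axis=0)  # one maximizer per random direction
    return np.unique(idx)


rng = np.random.default_rng(0)

# Synthetic data (assumed): convex combinations of 4 points in a
# 3-dimensional subspace of R^50, plus small noise.
n, d, r = 2000, 50, 3
weights = rng.dirichlet(np.ones(4), size=n)        # n x 4, rows sum to 1
corners = rng.standard_normal((4, r))              # 4 "archetypes" in R^3
basis = rng.standard_normal((r, d))                # embeds R^3 into R^50
X = weights @ corners @ basis + 1e-3 * rng.standard_normal((n, d))

# Step 1: reduce the dimension from d = 50 to k = 10.
Y = random_projection(X, k=10, rng=rng)

# Step 2: keep only points that look extreme in at least one random direction.
keep = extreme_point_candidates(Y, n_directions=200, rng=rng)
X_reduced = Y[keep]
```

Under these assumptions, an archetypal analysis solver would then be run on `X_reduced` instead of `X`. Both the target dimension `k` and the number of directions trade accuracy for speed, and the values used here are arbitrary.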
