Two-stage Ensemble Clustering of Functional Data Using Random Projections
Sourav Chakrabarty, Anirvan Chakraborty, Shyamal K. De

TL;DR
This paper introduces a two-stage ensemble clustering method for functional data using Gaussian process-based random projections, improving accuracy over existing techniques.
Contribution
It presents a novel two-stage clustering framework that combines random projections and data-driven directions for functional data analysis.
Findings
Achieves high clustering accuracy in simulations and real data.
Outperforms many state-of-the-art functional data clustering methods.
Applicable to irregular and partially observed functional data.
Abstract
We propose a computationally simple framework for clustering functional data based on Gaussian-process-generated random projections. In this approach, each curve is first projected onto a large collection of independent Gaussian process realizations. The resulting high-dimensional representations are clustered using the Mean Absolute Difference of Distances (MADD), a dissimilarity measure well suited to high-dimensional settings. A population-level analysis of this dissimilarity provides insight into how random projections help capture distributional differences between functional populations. We introduce a second stage of clustering that additionally leverages data-driven projection directions. Thus, in Stage I, an initial clustering is obtained using a set of prespecified projection families. In Stage II, this partition is refined by constructing Gaussian random projections based on…
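The Stage I idea described in the abstract (project each curve onto many Gaussian process realizations, then cluster the projected vectors with MADD) can be illustrated with a minimal sketch. Several pieces here are assumptions rather than the authors' choices: Brownian motion stands in for the prespecified projection families, MADD is taken in the form commonly used in the high-dimensional clustering literature (the average over the remaining points k of |d(x_i, x_k) - d(x_j, x_k)|, with d a scaled Euclidean distance), average-linkage hierarchical clustering stands in for whatever clustering step the paper uses, and the function names `random_projections` and `madd_matrix` are hypothetical. Stage II is not sketched, since its description is truncated above.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def random_projections(curves, grid, n_proj, rng):
    """Project curves onto n_proj Brownian-motion paths (an assumed GP choice)."""
    dt = np.diff(grid, prepend=0.0)  # grid spacing, first step measured from 0
    # Brownian paths: cumulative sums of independent N(0, dt) increments.
    increments = rng.normal(scale=np.sqrt(dt), size=(n_proj, grid.size))
    paths = np.cumsum(increments, axis=1)
    # Riemann-sum approximation of the L2 inner product <curve, path>.
    return (curves * dt) @ paths.T


def madd_matrix(features):
    """Pairwise MADD (assumed form): mean over k not in {i, j} of |d(i,k) - d(j,k)|."""
    n, d = features.shape
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2) / d)  # scaled Euclidean distance
    total = np.abs(dist[:, None, :] - dist[None, :, :]).sum(axis=2)
    # The k = i and k = j terms each equal dist[i, j]; remove them from the sum.
    return np.maximum(total - 2.0 * dist, 0.0) / (n - 2)


# Toy data (hypothetical): two groups of noisy curves differing by a mean shift.
rng = np.random.default_rng(0)
m = 50
grid = np.linspace(1.0 / m, 1.0, m)
truth = np.repeat([0, 1], 10)
noise = rng.normal(scale=0.1, size=(truth.size, m))
curves = np.sin(2 * np.pi * grid) + 3.0 * truth[:, None] + noise

features = random_projections(curves, grid, n_proj=200, rng=rng)
madd = madd_matrix(features)

# Stand-in clustering step: average linkage on the MADD dissimilarities.
tree = linkage(squareform(madd, checks=False), method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

The toy groups are deliberately well separated, so the example only shows the mechanics of projecting and computing MADD; it says nothing about the accuracy reported in the paper.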
