High-dimensional variable clustering based on maxima of a weakly dependent random process
Alexis Boulin, Elena Di Bernardino, Thomas Laloë, Gwladys Toulemonde

TL;DR
This paper introduces a novel clustering model based on the independence of maxima in multivariate processes, providing an algorithm with theoretical guarantees and applications in neuroscience and environmental data.
Contribution
The paper develops the AI-block model for variable clustering, offering a new approach that does not require pre-specifying the number of clusters and includes a consistent, polynomial-time algorithm.
Findings
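Under certain conditions, the algorithm consistently recovers the variable clusters, with computational complexity polynomial in the dimension.
The model class is identifiable: a maximal element exists for a partial order between partitions, which makes statistical inference possible.
Applications to neuroscience and environmental data illustrate the method on real-world datasets.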
Abstract
We propose a new class of models for variable clustering called Asymptotic Independent block (AI-block) models, which defines population-level clusters based on the independence of the maxima of a multivariate stationary mixing random process among clusters. This class of models is identifiable, meaning that there exists a maximal element with a partial order between partitions, allowing for statistical inference. We also present an algorithm depending on a tuning parameter that recovers the clusters of variables without specifying the number of clusters a priori. Our work provides some theoretical insights into the consistency of our algorithm, demonstrating that under certain conditions it can effectively identify clusters in the data with a computational complexity that is polynomial in the dimension. A data-driven selection method for the tuning parameter is also proposed. To…
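Illustrative sketch
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a threshold-based clustering of variables by dependence between block maxima might look, not the authors' procedure. Everything specific here is an assumption made for illustration: the madogram-based estimate of pairwise extremal correlation, the greedy grouping rule, the function names, the block length `m`, and the threshold `tau` (standing in for the tuning parameter). The number of clusters is never passed in; it falls out of the threshold.

```python
import numpy as np

def block_maxima(X, m):
    """Componentwise maxima over consecutive blocks of length m (assumed block scheme)."""
    n, d = X.shape
    k = n // m
    return X[: k * m].reshape(k, m, d).max(axis=1)

def extremal_correlation(M):
    """Pairwise extremal correlation estimated from a rank-based madogram (assumed estimator)."""
    k, d = M.shape
    # Rank-transform each column to pseudo-uniform margins.
    U = (np.argsort(np.argsort(M, axis=0), axis=0) + 1) / (k + 1)
    # Madogram: half the mean absolute difference between transformed margins.
    nu = 0.5 * np.abs(U[:, :, None] - U[:, None, :]).mean(axis=0)
    theta = (0.5 + nu) / (0.5 - nu)  # extremal coefficient, between 1 and 2 in theory
    return 2.0 - theta               # about 1 = fully dependent maxima, about 0 = independent

def threshold_clusters(chi, tau):
    """Greedy grouping: variables join a cluster when dependence exceeds tau (assumed rule)."""
    remaining = list(range(chi.shape[0]))
    clusters = []
    while remaining:
        if len(remaining) == 1:
            group = remaining[:]
        else:
            sub = chi[np.ix_(remaining, remaining)].copy()
            np.fill_diagonal(sub, -np.inf)
            i, j = np.unravel_index(np.argmax(sub), sub.shape)
            a, b = remaining[i], remaining[j]
            if chi[a, b] < tau:
                group = [a]  # most dependent pair is still below tau: singleton cluster
            else:
                group = [s for s in remaining if min(chi[a, s], chi[b, s]) >= tau]
        clusters.append(sorted(group))
        remaining = [s for s in remaining if s not in group]
    return clusters

# Synthetic demo: variables {0, 1} share one latent signal, {2, 3} share another.
rng = np.random.default_rng(0)
n = 6000
z = rng.standard_normal((n, 2))
X = np.repeat(z, 2, axis=1) + 0.1 * rng.standard_normal((n, 4))

M = block_maxima(X, m=20)
chi = extremal_correlation(M)
clusters = threshold_clusters(chi, tau=0.3)
print(clusters)
```

In this toy setting the two latent signals are independent, so the estimated dependence between maxima is near zero across groups and near one within groups, and any moderate threshold separates them. With real data the choice of `tau` matters, which is presumably why the abstract mentions a data-driven selection method.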
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Markov Chains and Monte Carlo Methods
