COREclust: a new package for a robust and scalable analysis of complex data
Camille Champion (IMT), Anne-Claire Brunet (IMT), Jean-Michel Loubes (IMT), Laurent Risser (IMT)

TL;DR
COREclust is an R package that uses a novel graph clustering algorithm to identify representative variable sets in high-dimensional data, providing robust estimates even with limited observations.
Contribution
The paper introduces COREclust, a new scalable R package with a graph clustering algorithm for robust variable selection in high-dimensional, small-sample datasets.
Findings
Effective detection of variable sets in synthetic data.
Application to real datasets demonstrates robustness.
Algorithm maintains reasonable computational cost.
Abstract
In this paper, we present a new R package, COREclust, dedicated to the detection of representative variables in high-dimensional spaces with a potentially limited number of observations. Variable set detection is based on an original graph clustering strategy, denoted the CORE-clustering algorithm, that detects CORE-clusters, i.e. variable sets having a user-defined size range and in which each variable is very similar to at least one other variable. Representative variables are then robustly estimated as the CORE-cluster centers. This strategy is entirely coded in C++ and wrapped by R using the Rcpp package. A particular effort has been dedicated to keeping its algorithmic cost reasonable so that it can be used on large datasets. After motivating our work, we will explain the CORE-clustering algorithm as well as a greedy extension of this algorithm. We will then present how to use it and results…
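To make the notion of a CORE-cluster more concrete, the following is a rough sketch of the idea as the abstract describes it, not the package's C++ implementation. It groups variables so that every member was linked to its group through a highly similar partner, keeps group sizes within a user-defined range, and picks one member per group as its center. The function names, the use of absolute Pearson correlation as the similarity, the greedy merging rule, and the rule for choosing the center are all assumptions made for illustration.

```python
import math


def abs_correlation(x, y):
    """Absolute Pearson correlation of two equal-length sequences (assumed similarity)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable is treated as similar to nothing
    return abs(sxy) / math.sqrt(sxx * syy)


def core_clusters_sketch(columns, min_size, max_size):
    """Hypothetical greedy grouping of variables; returns a list of (members, center).

    columns: list of variables, each a list of observations.
    """
    if not 2 <= min_size <= max_size:
        raise ValueError("expected 2 <= min_size <= max_size")
    p = len(columns)
    sim = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            sim[i][j] = sim[j][i] = abs_correlation(columns[i], columns[j])

    # Visit variable pairs from most to least similar.
    pairs = sorted(
        ((sim[i][j], i, j) for i in range(p) for j in range(i + 1, p)),
        reverse=True,
    )
    parent = list(range(p))            # union-find forest over variables
    members = {i: [i] for i in range(p)}
    frozen = set()                     # roots of groups already reported
    clusters = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for _, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri == rj or ri in frozen or rj in frozen:
            continue
        if len(members[ri]) + len(members[rj]) > max_size:
            continue                   # merging would exceed the size range
        parent[rj] = ri
        members[ri].extend(members.pop(rj))
        if len(members[ri]) >= min_size:
            frozen.add(ri)
            group = sorted(members[ri])
            # Assumed center rule: the member most similar, in total, to the others.
            center = max(
                group, key=lambda v: sum(sim[v][u] for u in group if u != v)
            )
            clusters.append((group, center))
    return clusters
```

Calling `core_clusters_sketch(columns, 3, 4)` on a dataset with two blocks of three strongly correlated variables and a few unrelated ones would be expected to report the two blocks and leave the unrelated variables ungrouped. This quadratic-time sketch ignores the scalability concerns the paper emphasizes.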
Taxonomy
Topics: Complex Network Analysis Techniques · Advanced Clustering Algorithms Research · Neural Networks and Applications
