Sparse and geometry-aware generalisation of the mutual information for joint discriminative clustering and feature selection
Louis Ohl, Pierre-Alexandre Mattei, Charles Bouveyron, Mickaël Leclercq, Arnaud Droit, Frédéric Precioso

TL;DR
Sparse GEMINI is a scalable, discriminative clustering method that maximizes a geometry-aware mutual information measure, effectively selecting relevant features without prior assumptions, demonstrated on synthetic and large-scale datasets.
Contribution
Introduces Sparse GEMINI, a novel scalable clustering model that maximizes a geometry-aware mutual information for effective feature selection without prior hypotheses.
Findings
Competitive performance on synthetic datasets
Effective variable selection without relevance criteria
Scalable to high-dimensional, large-scale data
Abstract
Feature selection in clustering is a hard task that requires simultaneously discovering relevant clusters and the variables relevant to those clusters. While feature selection algorithms are often model-based, relying on optimised model selection or strong assumptions on the data distribution, we introduce a discriminative clustering model that maximises a geometry-aware generalisation of the mutual information, called GEMINI, with a simple ℓ1 penalty: the Sparse GEMINI. This algorithm avoids the burden of combinatorial feature subset exploration and scales easily to high-dimensional data and large numbers of samples, while only requiring the design of a discriminative clustering model. We demonstrate the performance of Sparse GEMINI on synthetic datasets and large-scale datasets. Our results show that Sparse GEMINI is a competitive algorithm and has the ability to select…
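To illustrate why an ℓ1 penalty can select features without enumerating feature subsets, here is a minimal sketch of soft-thresholding, the proximal operator of the ℓ1 norm. It is a generic illustration and not the authors' implementation: the weight vector, the penalty strength, and the names `soft_threshold`, `weights` and `penalty` are all hypothetical, and the GEMINI objective itself is not computed here.

```python
import numpy as np


def soft_threshold(w, threshold):
    """Proximal operator of threshold * ||w||_1.

    Shrinks every entry toward zero and sets entries whose magnitude
    is at most `threshold` to exactly zero.
    """
    return np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)


# Hypothetical per-feature weights of a clustering model (illustrative values only).
weights = np.array([1.5, -0.05, 0.3, 0.02, -2.0])
penalty = 0.1  # hypothetical l1 strength

sparse_weights = soft_threshold(weights, penalty)

# Features whose weight stays non-zero are the ones kept by the penalty.
selected = np.flatnonzero(sparse_weights)
```

In a proximal gradient scheme, a step like this would typically follow each gradient update of the clustering objective. Features whose weights are driven to exactly zero no longer influence the cluster assignment, which is how sparsity and feature selection coincide.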
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Face and Expression Recognition
Methods: Feature Selection
