# Variable selection in model-based clustering and discriminant analysis with a regularization approach

**Authors:** Gilles Celeux, Cathy Maugis-Rabusseau, Mohammed Sedki

arXiv: 1705.00946 · 2017-05-03

## TL;DR

This paper introduces a regularization-based variable selection method for model-based clustering and classification, replacing slow stepwise procedures with a lasso-like ranking to efficiently handle high-dimensional data.

## Contribution

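The paper proposes a regularization approach in which variables are first ranked with a lasso-like procedure, so that variable selection for model-based clustering and classification no longer depends on slow stepwise search and can be applied to large, high-dimensional data sets.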

## Key findings

- Variables are ranked with a lasso-like procedure before any selection step.
- The ranking makes the variable selection methodology of Maugis et al. (2009b) applicable to high-dimensional data sets.
- Avoiding backward and forward stepwise procedures reduces computation time.

## Abstract

Relevant methods of variable selection have been proposed in model-based clustering and classification. These methods use backward or forward procedures to define the roles of the variables. Unfortunately, these stepwise procedures are terribly slow and make these variable selection algorithms impractical for large data sets. In this paper, an alternative regularization approach to variable selection is proposed for model-based clustering and classification. In this approach, the variables are first ranked with a lasso-like procedure in order to avoid painfully slow stepwise algorithms. Thus, the variable selection methodology of Maugis et al. (2009b) can be efficiently applied to high-dimensional data sets.
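To make the idea of a lasso-like ranking concrete, here is a minimal sketch in the supervised (discriminant analysis) setting, where class labels are known. It is not the authors' procedure: it assumes, purely for illustration, that each variable is standardized, that its class means are soft-thresholded over a grid of penalty values, and that a variable's score is the number of grid values at which at least one class mean stays nonzero. The names `soft_threshold` and `rank_variables`, the penalty grid, and the simulated data are all hypothetical.

```python
import random
import statistics


def soft_threshold(x, lam):
    """Soft-thresholding: the closed-form L1 (lasso) shrinkage of a scalar."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def rank_variables(rows, labels, lambdas):
    """Rank variables by how long their class means survive L1 shrinkage.

    rows: list of observations (lists of floats); labels: one class per row.
    Returns (order, scores), with variable indices sorted most relevant first.
    """
    n_vars = len(rows[0])
    classes = sorted(set(labels))
    scores = [0] * n_vars
    for j in range(n_vars):
        col = [r[j] for r in rows]
        mean = statistics.fmean(col)
        sd = statistics.pstdev(col) or 1.0  # guard against constant columns
        # Class means of the standardized variable j.
        class_means = [
            statistics.fmean((x - mean) / sd for x, y in zip(col, labels) if y == k)
            for k in classes
        ]
        # Count the penalty levels at which some class mean is still nonzero.
        for lam in lambdas:
            if any(soft_threshold(m, lam) != 0.0 for m in class_means):
                scores[j] += 1
    order = sorted(range(n_vars), key=lambda j: -scores[j])
    return order, scores


# Simulated data (an assumption for this demo): variables 0 and 1 separate
# the two classes, variables 2 and 3 are pure noise.
rng = random.Random(0)
rows, labels = [], []
for k, shift in ((0, -2.0), (1, 2.0)):
    for _ in range(200):
        rows.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0),
                     rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        labels.append(k)

lambdas = [0.1 * i for i in range(1, 11)]
order, scores = rank_variables(rows, labels, lambdas)
print("ranking:", order, "scores:", scores)
```

The ranking alone costs one pass over the penalty grid per variable, which is the source of the speed-up over stepwise search. A selection step, such as the methodology of Maugis et al. (2009b) mentioned in the abstract, would then examine variables in this order rather than testing every candidate inclusion or exclusion.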

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.00946/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1705.00946/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1705.00946/full.md

---
Source: https://tomesphere.com/paper/1705.00946