Clustering and Feature Selection using Sparse Principal Component Analysis

Ronny Luss; Alexandre d'Aspremont

arXiv:0707.0701 · cs.AI · October 8, 2008

TL;DR

This paper explores the use of sparse PCA for clustering and feature selection, emphasizing interpretability through sparse factors that highlight key variables in biological data.

Contribution

It applies sparse PCA to clustering and feature selection, demonstrating its effectiveness and interpretability in biological datasets.

Findings

1. Sparse PCA produces interpretable clusters with fewer variables.
2. The method effectively identifies key features in biological data.
3. Sparse factors explain significant variance with limited nonzero coefficients.

Abstract

In this paper, we study the application of sparse principal component analysis (PCA) to clustering and feature selection problems. Sparse PCA seeks sparse factors, or linear combinations of the data variables, that explain a maximum amount of variance in the data while having only a limited number of nonzero coefficients. PCA is often used as a simple clustering technique, and sparse factors allow us here to interpret the clusters in terms of a reduced set of variables. We begin with a brief introduction to, and motivation for, sparse PCA and detail our implementation of the algorithm in d'Aspremont et al. (2005). We then apply these results to some classic clustering and feature selection problems arising in biology.
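To make the idea of a sparse factor concrete, here is a minimal sketch in Python. It is not the algorithm of d'Aspremont et al. (2005) that the paper implements. It swaps in a much simpler heuristic, truncated power iteration, which keeps only the k largest loadings at each step. The function name `truncated_power_sparse_pc`, the synthetic data, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def truncated_power_sparse_pc(cov, k, n_iter=200):
    """Heuristic leading sparse factor with at most k nonzero loadings.

    Illustrative stand-in only: truncated power iteration, not the
    method of d'Aspremont et al. (2005).
    """
    n = cov.shape[0]
    # Start from the single variable with the largest variance.
    x = np.zeros(n)
    x[np.argmax(np.diag(cov))] = 1.0
    for _ in range(n_iter):
        y = cov @ x
        # Keep the k entries of largest magnitude, zero out the rest.
        keep = np.argsort(np.abs(y))[-k:]
        x_new = np.zeros(n)
        x_new[keep] = y[keep]
        x_new /= np.linalg.norm(x_new)
        if np.allclose(x_new, x):
            break
        x = x_new
    return x


# Hypothetical data: two groups of samples that differ only in the
# first 3 of 20 variables.
rng = np.random.default_rng(0)
n_samples, n_vars = 100, 20
labels = np.repeat([0, 1], n_samples // 2)
data = rng.normal(size=(n_samples, n_vars))
data[labels == 1, :3] += 4.0
data -= data.mean(axis=0)
cov = data.T @ data / n_samples

factor = truncated_power_sparse_pc(cov, k=3)
support = np.flatnonzero(factor)   # variables selected by the factor
scores = data @ factor             # projection used for clustering
explained = factor @ cov @ factor  # variance explained by the factor

print("selected variables:", support)
print("variance explained:", explained)
print("cluster guess matches labels:", np.mean((scores > 0) == labels))
```

The projection `scores` plays the role of the PCA-based clustering described in the abstract, while `support` lists the few variables that drive the split, which is what makes the clusters interpretable.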

Taxonomy

Topics: Gene expression and cancer classification · Sparse and Compressive Sensing Techniques · Face and Expression Recognition