Unsupervised Variable Selection for Ultrahigh-Dimensional Clustering Analysis

Tonglin Zhang; Huyunting Huang

arXiv:2411.19448·stat.ME·December 2, 2024

TL;DR

This paper introduces an unsupervised variable selection method, FPCFL, that improves clustering performance in ultrahigh-dimensional data by effectively distinguishing informative variables from uninformative ones.

Contribution

The paper proposes FPCFL (forward partial-variable clustering full-variable loss), an unsupervised variable selection method that distinguishes active, redundant, and uninformative variables and thereby improves clustering accuracy.

Findings

01 FPCFL outperforms existing methods in simulations.

02 Excluding uninformative variables improves clustering results.

03 Selecting a relevant subset can match full-variable clustering performance.

Abstract

Research on unsupervised variable selection lags far behind research on supervised variable selection. A forward partial-variable clustering full-variable loss (FPCFL) method is proposed to address the corresponding challenges. One advantage is that FPCFL can distinguish active, redundant, and uninformative variables, which previous methods cannot. Theoretical and simulation studies show that a clustering method using all the variables can perform worse when many uninformative variables are involved; better results are expected if the uninformative variables are excluded. The research addresses a previous concern about how variable selection affects clustering performance. Whereas many previous methods attempt to select all the relevant variables, the proposed method selects a subset that can induce an equally good result. This phenomenon does…
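The claim that uninformative variables can hurt a clustering method can be illustrated with a small simulation. The sketch below is not the FPCFL method and is not taken from the paper. It only compares a plain k-means (Lloyd's algorithm, written with NumPy) on a few informative variables against the same k-means on those variables plus many pure-noise variables. The function names, sample size, number of variables, and mean shift are all assumptions chosen for illustration.

```python
import numpy as np


def kmeans_labels(X, k=2, n_init=10, n_iter=30, seed=0):
    """Plain Lloyd's k-means with random restarts; returns the labels of the lowest-loss run."""
    rng = np.random.default_rng(seed)
    best_labels, best_loss = None, np.inf
    for _ in range(n_init):
        # Start from k distinct observations chosen at random.
        centers = X[rng.choice(len(X), size=k, replace=False)].copy()
        for _ in range(n_iter):
            dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dist.argmin(axis=1)
            for j in range(k):
                members = X[labels == j]
                if len(members) > 0:  # keep the old center if a cluster empties
                    centers[j] = members.mean(axis=0)
        # Final assignment and within-cluster sum of squares for this restart.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        loss = dist.min(axis=1).sum()
        if loss < best_loss:
            best_labels, best_loss = labels, loss
    return best_labels


def two_cluster_accuracy(labels, truth):
    """Agreement with the true grouping for two clusters, invariant to label swapping."""
    agree = np.mean(labels == truth)
    return max(agree, 1.0 - agree)


def compare_clusterings(n=100, n_informative=5, n_noise=5000, shift=1.0, seed=0):
    """Hypothetical setup: two groups differ only in the informative variables."""
    rng = np.random.default_rng(seed)
    truth = np.repeat([0, 1], n // 2)
    signs = np.where(truth == 0, -shift, shift)[:, None]
    informative = signs + rng.standard_normal((n, n_informative))
    noise = rng.standard_normal((n, n_noise))  # uninformative variables
    X_all = np.hstack([informative, noise])
    acc_informative = two_cluster_accuracy(kmeans_labels(informative, seed=seed), truth)
    acc_all = two_cluster_accuracy(kmeans_labels(X_all, seed=seed), truth)
    return acc_informative, acc_all


if __name__ == "__main__":
    acc_informative, acc_all = compare_clusterings()
    print(f"informative variables only: {acc_informative:.2f}")
    print(f"all variables:              {acc_all:.2f}")
```

Under these assumed settings the noise variables dominate the Euclidean distances, so the full-variable clustering is expected to recover the true groups less accurately than the clustering restricted to the informative variables. Changing `n_noise` or `shift` shows how the gap depends on the signal-to-noise balance.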

Taxonomy

Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition