Scalable unsupervised feature selection via weight stability

Xudong Zhang; Renato Cordeiro de Amorim

arXiv:2506.06114·cs.LG·May 19, 2026

TL;DR

This paper introduces scalable unsupervised feature selection methods that leverage weight stability across Minkowski exponents, improving clustering in high-dimensional data.

Contribution

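It introduces Minkowski weighted $k$-means++, an initialisation strategy for Minkowski Weighted $k$-means, and builds two feature selection algorithms on it: FS-MWK++, which aggregates feature weights across a range of Minkowski exponents, and SFS-MWK++, a scalable variant based on subsampling. Both are supported by a theoretical analysis of how relevant and noise features are weighted.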

Findings

01

The algorithms effectively distinguish relevant from noise features.

02

Theoretical analysis shows that, under explicit assumptions on noise features and cluster structure, relevant features are assigned consistently higher weights than noise features.

03

A software implementation is publicly available on GitHub in the xzhang4-ops1/FSMWK repository.

Abstract

Unsupervised feature selection is critical for improving clustering performance in high-dimensional data, where irrelevant features can obscure meaningful structure. In this work, we introduce the Minkowski weighted $k$-means++, a novel initialisation strategy for the Minkowski Weighted $k$-means. Our initialisation selects centroids probabilistically using feature relevance estimates derived from the data itself. Building on this, we propose two new feature selection algorithms, FS-MWK++, which aggregates feature weights across a range of Minkowski exponents to identify stable and informative features, and SFS-MWK++, a scalable variant based on subsampling. We support our approach with a theoretical analysis, demonstrating that, under explicit assumptions on noise features and cluster structure, relevant features are assigned consistently higher weights than noise features across a…
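
To make the idea of aggregating feature weights across Minkowski exponents concrete, here is a minimal sketch. It is not the authors' FS-MWK++ procedure, whose details are not given on this page. It assumes a fixed cluster assignment, uses the cluster mean as a stand-in for the Minkowski centre, and applies the weight update usually stated for Minkowski Weighted $k$-means, where the weight of a feature in a cluster is proportional to its within-cluster dispersion raised to the power $-1/(p-1)$. The function names `mwk_style_weights` and `aggregate_weights`, the toy data, and the choice of exponents are all hypothetical.

```python
def mwk_style_weights(X, labels, p, eps=1e-9):
    """Per-cluster feature weights, proportional to dispersion ** (-1 / (p - 1)).

    Illustrative only: labels are taken as given rather than learned.
    """
    if p <= 1:
        raise ValueError("this update rule needs p > 1")
    n_features = len(X[0])
    weights = {}
    for k in sorted(set(labels)):
        members = [x for x, lab in zip(X, labels) if lab == k]
        # Assumption: the mean stands in for the Minkowski centre of the cluster.
        centre = [sum(x[v] for x in members) / len(members)
                  for v in range(n_features)]
        # Within-cluster dispersion of each feature under exponent p;
        # eps guards against a zero dispersion.
        disp = [sum(abs(x[v] - centre[v]) ** p for x in members) + eps
                for v in range(n_features)]
        inv = [d ** (-1.0 / (p - 1.0)) for d in disp]
        total = sum(inv)
        # Normalise so the weights of each cluster sum to 1.
        weights[k] = [i / total for i in inv]
    return weights


def aggregate_weights(X, labels, exponents):
    """Average the per-cluster weights over all clusters and exponents."""
    n_features = len(X[0])
    acc = [0.0] * n_features
    count = 0
    for p in exponents:
        for w in mwk_style_weights(X, labels, p).values():
            acc = [a + b for a, b in zip(acc, w)]
            count += 1
    return [a / count for a in acc]


# Toy data: feature 0 separates the two clusters, feature 1 is noise.
X = [[0.0, 0.1], [0.1, 0.9], [0.05, 0.5],
     [1.0, 0.2], [1.1, 0.8], [1.05, 0.4]]
labels = [0, 0, 0, 1, 1, 1]
scores = aggregate_weights(X, labels, exponents=[1.5, 2.0, 3.0])
```

In this toy setting the feature with low within-cluster dispersion keeps the larger aggregated score at every exponent, which is the kind of stability the abstract refers to. A subsampling variant in the spirit of SFS-MWK++ could call the same functions on random subsets of rows, though how the paper combines those runs is not stated here.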

Code & Models

Repositories

xzhang4-ops1/FSMWK (GitHub)
