Spectral Self-supervised Feature Selection
Daniel Segal, Ofir Lindenbaum, Ariel Jaffe

TL;DR
This paper introduces a spectral self-supervised feature selection method that leverages graph Laplacian eigenvectors and pseudo-labels to identify meaningful features in high-dimensional data, improving downstream analysis robustness.
Contribution
It presents a novel graph-based self-supervised approach for feature selection that is robust to outliers and complex data structures, with a model stability criterion for eigenvector selection.
Findings
Effective in biological datasets
Robust to outliers and complex structures
Improves downstream analysis accuracy
Abstract
Choosing a meaningful subset of features from high-dimensional observations in unsupervised settings can greatly enhance the accuracy of downstream analysis, such as clustering or dimensionality reduction, and provide valuable insights into the sources of heterogeneity in a given dataset. In this paper, we propose a self-supervised graph-based approach for unsupervised feature selection. Our method's core involves computing robust pseudo-labels by applying simple processing steps to the graph Laplacian's eigenvectors. The subset of eigenvectors used for computing pseudo-labels is chosen based on a model stability criterion. We then measure the importance of each feature by training a surrogate model to predict the pseudo-labels from the observations. Our approach is shown to be robust to challenging scenarios, such as the presence of outliers and complex substructures. We demonstrate…
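The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel with a median bandwidth, the median-threshold binarization, the ridge-regression surrogate, and all function names (`laplacian_eigenvectors`, `binarize`, `surrogate_scores`, `select_features`) are assumptions made for the example, and the stability-based choice of eigenvectors is omitted here.

```python
import numpy as np

def laplacian_eigenvectors(X, n_vecs=1):
    """Leading nontrivial eigenvectors of a symmetric normalized graph Laplacian."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, 0.0)
    sigma2 = np.median(d2[d2 > 0])            # assumed bandwidth heuristic
    W = np.exp(-d2 / (2.0 * sigma2))          # assumed Gaussian affinity
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)               # eigenvalues in ascending order
    return vecs[:, 1:1 + n_vecs]              # skip the trivial first eigenvector

def binarize(v):
    """Pseudo-labels from one eigenvector (assumed rule: threshold at the median)."""
    return (v > np.median(v)).astype(float)

def surrogate_scores(X, y, lam=1e-3):
    """Feature importance as |coefficients| of a ridge surrogate (an assumption)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    coef = np.linalg.solve(Z.T @ Z + lam * np.eye(X.shape[1]), Z.T @ (y - y.mean()))
    return np.abs(coef)

def select_features(X, n_select, n_vecs=1):
    """Score each feature by its best importance over the pseudo-label sets."""
    V = laplacian_eigenvectors(X, n_vecs)
    per_vec = [surrogate_scores(X, binarize(V[:, j])) for j in range(V.shape[1])]
    scores = np.max(per_vec, axis=0)
    return np.argsort(scores)[::-1][:n_select], scores

# Synthetic check: two clusters that differ only in features 0 and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X[:50, :2] += 3.0
X[50:, :2] -= 3.0
top, scores = select_features(X, n_select=2)
print(sorted(top.tolist()))
```

On this toy data the two cluster-defining features should receive the largest scores, since only they help the surrogate predict the pseudo-labels.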
Peer Reviews
Decision: ICLR 2024 Conference Withdrawn Submission
The problem that is addressed is one of the fundamental problems in machine learning, though one that has not received as much attention as related problems. The authors are motivated by a medical use-case and seem to succeed to improve existing methods to be more robust to outliers in data. The explanation of the method is easy to follow and the paper is organized well.
I have several concerns about the evaluation method. The paper claims to follow Cai et al., but there seem to be important differences between its methodology and that reference; the method seems closer to Li 2012, which I also think uses somewhat questionable evaluation metrics. Evaluating unsupervised methods is quite difficult, and the paper makes a good effort at a fair and thorough evaluation, but parts of the method are unclear and other parts seem questionable. In particu
This paper presents a new method for feature selection. Experimental results indicate the method's effectiveness.
The proposed method builds on the Multi-Cluster Feature Selection (MCFS) framework, where feature selection relies on predicting pseudo-labels from spectral clustering. However, its distinct contributions compared to existing MCFS methods remain ambiguous. On Page 5, the mathematical formulations, including the definition and indexing of $s_m$, are unclear. I am also unsure about the distinctions between $h_{i, b}$ and $h_i$ and the differences between $s_m$ and $\bar{s}_m$. Additionally, in Section 3
1. **Originality**:
   - The paper presents a fresh take on feature selection by introducing a method grounded in the filtration of eigenvectors of the graph Laplacian. Such an approach stands out due to its uniqueness in leveraging spectral properties for self-supervised feature selection.
   - The utilization of surrogate models as a part of the selection process adds another layer of novelty, as this is seldom seen in traditional feature selection mechanisms.
2. **Quality**:
   - The synt
1. **Absence of Theoretical Justification:** The paper lacks rigorous theoretical analysis supporting its central claim, specifically that "certain features might not significantly affect supervised learning, they can heavily impact unsupervised tasks." In the absence of a theoretical framework, the claim rests primarily on empirical observations. To further the contribution, the authors should delve into a theoretical discussion, offering insights into why and under what conditions their claim
The paper shows how the presence of outliers (using real datasets) and of structured noise (using synthetic data) affects the quality of the embedding in the standard eigenvector-based approach to feature selection, highlighting the role of lower-order eigenvectors and the value of binarization in their evaluation. The numerical results (clustering accuracy) and ablation study highlight the positive effect of the proposed additional steps for these cases.
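The value of binarization under outliers can be illustrated with a toy vector. The sketch below is an assumption-laden illustration rather than an experiment from the paper: the "eigenvector" is hand-built (two tight groups plus one outlier entry), the median threshold is an assumed binarization rule, and `binarize` and `pearson` are hypothetical helper names.

```python
import numpy as np

def binarize(v):
    """Assumed binarization rule: threshold at the median."""
    return (v > np.median(v)).astype(float)

def pearson(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# Hand-built stand-in for an eigenvector: two groups of inliers near zero.
inliers = np.concatenate([np.linspace(-0.02, -0.01, 10),
                          np.linspace(0.01, 0.02, 10)])
# A feature that separates the two groups; the outlier sample has value 0.
feature = np.concatenate([-np.ones(10), np.ones(10), [0.0]])

v_mild = np.append(inliers, 5.0)       # one outlier entry
v_extreme = np.append(inliers, 500.0)  # same outlier, far larger magnitude

raw_corr = pearson(v_mild, feature)            # dominated by the outlier
bin_corr = pearson(binarize(v_mild), feature)  # reflects the group structure
same_labels = np.array_equal(binarize(v_mild), binarize(v_extreme))

print(round(raw_corr, 3), round(bin_corr, 3), same_labels)
```

In this example the raw vector is nearly uncorrelated with the group-separating feature because the outlier entry dominates its variance, while the binarized pseudo-labels track the groups and do not change with the outlier's magnitude.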
The presentation is not clear at times. For example, it takes a while to understand the motivation of the approach, which is done a bit by "reverse-engineering": once it is clear that a methodology to evaluate eigenvectors independent of the eigenvalues is established, the motivation for such a choice is presented via experiments. This could be more clearly stated in the introduction by contrasting what is being proposed here (at a high level) with the approaches from the literature. The propo
1. The idea of selecting the k most stable eigenvectors is convincing. 2. The paper is well-written and easy to follow.
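One way to picture "selecting the k most stable eigenvectors" is to refit a surrogate model on random subsamples and prefer eigenvectors whose feature scores change little between fits. The sketch below is a hedged illustration of that idea, not the paper's criterion: the ridge surrogate, the half-size subsamples, the variance-based stability measure, and the names `eigenvector_stability` and `surrogate_scores` are all assumptions, and the columns of `V` are synthetic stand-ins for Laplacian eigenvectors.

```python
import numpy as np

def binarize(v):
    """Assumed pseudo-label rule: threshold at the median."""
    return (v > np.median(v)).astype(float)

def surrogate_scores(X, y, lam=1e-3):
    """|coefficients| of a ridge surrogate predicting y from standardized X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    coef = np.linalg.solve(Z.T @ Z + lam * np.eye(X.shape[1]), Z.T @ (y - y.mean()))
    return np.abs(coef)

def eigenvector_stability(X, V, n_resamples=20, frac=0.5, seed=0):
    """Higher value = feature scores vary less across random subsamples."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = int(frac * n)
    stability = np.empty(V.shape[1])
    for j in range(V.shape[1]):
        y = binarize(V[:, j])
        runs = []
        for _ in range(n_resamples):
            idx = rng.choice(n, size=m, replace=False)
            s = surrogate_scores(X[idx], y[idx])
            runs.append(s / np.linalg.norm(s))   # compare score profiles, not scale
        # Negative total variance of the normalized scores across resamples.
        stability[j] = -np.var(np.array(runs), axis=0).sum()
    return stability

# Synthetic stand-ins: column 0 follows feature 0, column 1 is pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
V = np.column_stack([X[:, 0] + 0.1 * rng.normal(size=200),
                     rng.normal(size=200)])
stability = eigenvector_stability(X, V)
order = np.argsort(stability)[::-1]   # most stable first
k = 1
print(order[:k].tolist())
```

Under these assumptions, the column tied to a real feature yields nearly identical score profiles on every subsample, whereas the noise column yields profiles that fluctuate, so the former ranks as more stable.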
1. The paper focuses on a traditional unsupervised feature selection task and tries to improve a classical feature selection method, the Laplacian score. I'm not sure whether this paper can attract wide interest in the learning representations community. Maybe this paper is more appropriate for the ICML community. 2. The compared methods are out of date. Since feature selection is a very classical task in machine learning, there are a lot of feature selection methods proposed every year. However, most of t
Taxonomy
Topics: Face and Expression Recognition
