Objective-Induced Bias and Search Dynamics in Multiobjective Unsupervised Feature Selection
Mathieu Cherpitel, Thomas Bäck, Martijn R. Tannemaat, Anna V. Kononova

TL;DR
This paper investigates how different objective formulations influence search behavior and outcomes in multiobjective unsupervised feature selection, highlighting the importance of objective design for effective subset selection.
Contribution
It introduces PCA reconstruction loss as an evaluation objective that yields compact, high-quality feature subsets comparable to those of supervised methods, and it clarifies how objective choice shapes search dynamics.
Findings
Silhouette-based objectives bias the search toward trivial low-cardinality solutions.
PCA loss produces compact subsets with high test accuracy.
Objective choice critically impacts search behavior and solution quality.
Abstract
Unsupervised feature selection is commonly formulated as a multiobjective optimisation problem that jointly optimises subset quality and subset size. Yet the behaviour of this formulation depends critically on the choice of evaluation objective, the direction of subset-size regularisation, and the initialisation strategy. We study these factors in a controlled setting using a synthetic dataset with known informative, redundant, and irrelevant feature types. Six formulations are compared by combining three evaluation objectives (accuracy, silhouette score, and PCA reconstruction loss) with either subset-size minimisation or subset-size maximisation. The results show that the formulation strongly affects both search dynamics and the quality of the resulting Pareto front. Silhouette-based formulations exhibit a strong bias toward trivial low-cardinality solutions and remain weak proxies for predictive…
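The six formulations described in the abstract (three evaluation objectives crossed with two subset-size directions) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the summary does not give the paper's exact definition of PCA reconstruction loss, so `pca_reconstruction_loss` assumes one plausible reading (mean squared error after projecting the selected columns onto their top principal components). The names `FORMULATIONS`, `objective_vector`, and the `n_components` parameter are hypothetical.

```python
from itertools import product

import numpy as np

# Labels for the three evaluation objectives and two size directions.
QUALITY_OBJECTIVES = ("accuracy", "silhouette", "pca_loss")
SIZE_DIRECTIONS = ("min_size", "max_size")
FORMULATIONS = list(product(QUALITY_OBJECTIVES, SIZE_DIRECTIONS))  # 3 x 2 = 6


def pca_reconstruction_loss(X, mask, n_components=2):
    """Assumed definition: MSE between the centred selected columns and
    their reconstruction from the top `n_components` principal components."""
    mask = np.asarray(mask, dtype=bool)
    X_sub = X[:, mask]
    if X_sub.shape[1] == 0:
        return float("inf")  # empty subset: treat as the worst case
    centered = X_sub - X_sub.mean(axis=0)
    k = min(n_components, centered.shape[1])
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    reconstructed = centered @ components.T @ components
    return float(np.mean((centered - reconstructed) ** 2))


def objective_vector(X, mask, size_direction, n_components=2):
    """Return (quality, size) as a pair where both entries are minimised.
    Subset-size maximisation is expressed by negating the size."""
    size = int(np.sum(mask))
    size_term = size if size_direction == "min_size" else -size
    return pca_reconstruction_loss(X, mask, n_components), size_term
```

Under this assumed definition, any subset with at most `n_components` features reconstructs perfectly, so the loss is only informative for larger subsets. A different definition, such as reconstructing the full feature set from the selected features, would behave differently.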
