Feature selection for high dimensional data in astronomy
H. Zheng, Y. Zhang

TL;DR
This paper reviews feature selection methods for high-dimensional astronomical data, compares their performance and computational costs, and highlights the importance of choosing suitable methods and algorithms for effective data analysis.
Contribution
It provides a comprehensive taxonomy of feature selection methods and a case study comparing filter and wrapper approaches in astronomical data analysis.
Findings
Filter methods are computationally more efficient than wrapper methods.
Combining feature selection with an appropriate learning algorithm can improve performance.
Filter methods such as ReliefF and the Fisher filter perform well in high-dimensional settings.
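To make the filter idea concrete, here is a minimal sketch of a Fisher-style score in plain Python. It assumes the common definition (between-class scatter divided by within-class scatter, computed per feature), which may differ in detail from the Fisher filter used in the paper. The toy data and the helper names `fisher_scores` and `make_toy_data` are hypothetical and not taken from the paper.

```python
import random
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Score each feature by between-class scatter over within-class scatter."""
    n_features = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            n_c = len(values)
            between += n_c * (fmean(values) - overall_mean) ** 2
            within += n_c * pvariance(values)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def make_toy_data(n_samples=400, n_features=8, informative=(0, 1), shift=2.0, seed=0):
    """Synthetic two-class data (an assumption for illustration only):
    Gaussian noise everywhere, with a class-dependent mean shift on the
    informative features."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
scores = fisher_scores(X, y)
# Rank features from highest to lowest score and keep the top two.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
top_two = sorted(ranking[:2])
print("Fisher scores:", [round(s, 3) for s in scores])
print("Top-ranked features:", top_two)
```

Each feature is scored independently in a single pass over the data and no learner is trained, which is why filters of this kind tend to be cheap.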
Abstract
With the amount of astronomical data increasing exponentially, the complexity and dimensionality of the data are likewise growing rapidly. Extracting information from such data becomes a critical and challenging problem. For example, some algorithms can only be employed in low-dimensional spaces, so feature selection and feature extraction become important topics. Here we describe the difference between feature selection and feature extraction methods, and introduce the taxonomy of feature selection methods as well as the characteristics of each method. We present a case study comparing the performance and computational cost of different feature selection methods. For the filter method, ReliefF and the Fisher filter are adopted; for the wrapper method, improved CHAID, linear discriminant analysis (LDA), Naive Bayes (NB) and C4.5 are taken as learners. Applied on the sample, the…
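The wrapper approach named in the abstract can be sketched as a greedy forward search that retrains a learner for every candidate feature subset. The sketch below is an illustration under stated assumptions: it uses a simple nearest-centroid classifier as a stand-in learner (not CHAID, LDA, NB or C4.5), a single hold-out split, and synthetic data. The names `centroid_accuracy`, `forward_selection`, and `make_toy_data` are hypothetical.

```python
import random
from statistics import fmean


def make_toy_data(n_samples, n_features=8, informative=(0, 1), shift=3.0, seed=0):
    """Synthetic two-class data (illustrative assumption, not the paper's sample)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, features):
    """Hold-out accuracy of a nearest-centroid learner restricted to `features`."""
    centroids = {}
    for c in set(y_train):
        rows = [x for x, label in zip(X_train, y_train) if label == c]
        centroids[c] = [fmean([r[j] for r in rows]) for j in features]
    correct = 0
    for x, label in zip(X_test, y_test):
        predicted = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(features, centroids[c])),
        )
        correct += predicted == label
    return correct / len(y_test)


def forward_selection(X_train, y_train, X_test, y_test, k):
    """Greedily add the feature that most improves hold-out accuracy.
    Returns the selected features and the number of learner evaluations."""
    selected, n_fits = [], 0
    remaining = list(range(len(X_train[0])))
    while len(selected) < k:
        best_feature, best_acc = None, -1.0
        for j in remaining:
            acc = centroid_accuracy(X_train, y_train, X_test, y_test, selected + [j])
            n_fits += 1
            if acc > best_acc:
                best_feature, best_acc = j, acc
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected, n_fits


X_train, y_train = make_toy_data(300, seed=1)
X_test, y_test = make_toy_data(200, seed=2)
selected, n_fits = forward_selection(X_train, y_train, X_test, y_test, k=2)
print("Selected features:", selected)
print("Learner evaluations:", n_fits)
```

Selecting 2 of 8 features already requires 8 + 7 = 15 learner evaluations here, whereas a filter scores each feature once without training anything. That difference is the source of the cost gap the case study measures.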
Taxonomy
Topics: Spectroscopy and Chemometric Analyses
