Imbalanced learning for RR Lyrae stars
Jingyi Zhang, Yanxia Zhang, Yongheng Zhao

TL;DR
This paper combines machine learning with convex hull algorithms to identify RR Lyrae stars in large astronomical datasets, using multi-band photometric data and methods that account for class imbalance.
Contribution
It combines convex hull selection with machine learning classifiers for RR Lyrae star classification, and highlights the importance of multi-band data and cost-sensitive algorithms.
Findings
Convex hull method achieves up to 16.1% efficiency with 53% completeness.
GALEX ultraviolet data improves RR Lyrae star identification.
Fast Boxes algorithm performs best on imbalanced astronomical data.
Abstract
We apply machine learning and Convex-Hull algorithms to separate RR Lyrae stars from other stars, such as main-sequence stars, white dwarf stars, carbon stars, cataclysmic variables (CVs) and carbon-lines stars, based on data from the Sloan Digital Sky Survey (SDSS) and the Galaxy Evolution Explorer (GALEX). In the low-dimensional space, the Convex-Hull algorithm is applied to select RR Lyrae stars. Given the different input patterns (u-g, g-r), (g-r, r-i), (r-i, i-z), (u-g, g-r, r-i), (g-r, r-i, i-z), (u-g, g-r, i-z) and (u-g, r-i, i-z), different convex hulls can be built for RR Lyrae stars. Comparing the performance of these input patterns shows that (u-g, g-r, i-z) is the best. For this input pattern, the efficiency (the fraction of true RR Lyrae stars in the predicted RR Lyrae sample) is 4.2% at a completeness (the fraction of RR Lyrae stars recovered from the whole RR Lyrae sample) of 100%, and increases to 9.9% with…
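The selection idea described in the abstract can be illustrated with a minimal sketch. It assumes a two-colour input pattern such as (u-g, g-r), builds a convex hull around known RR Lyrae stars, selects every star falling inside the hull, and then computes efficiency and completeness as defined above. The colour values, labels, and function names are hypothetical and chosen only for illustration; this is not the authors' implementation, and the three-colour patterns would need a 3D hull instead.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # one star in a 2D colour space, e.g. (u-g, g-r)


def cross(o: Point, a: Point, b: Point) -> float:
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def inside_hull(hull: Sequence[Point], p: Point) -> bool:
    """True if p lies inside or on the boundary of a counter-clockwise convex hull."""
    n = len(hull)
    if n < 3:
        return False  # degenerate hull: treat nothing as selected
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def efficiency_completeness(labels: Sequence[bool],
                            selected: Sequence[bool]) -> Tuple[float, float]:
    """Efficiency = true RR Lyrae among selected; completeness = RR Lyrae recovered."""
    true_pos = sum(1 for y, s in zip(labels, selected) if y and s)
    n_selected = sum(selected)
    n_true = sum(labels)
    eff = true_pos / n_selected if n_selected else 0.0
    comp = true_pos / n_true if n_true else 0.0
    return eff, comp


# Hypothetical colours of known RR Lyrae stars used to build the hull.
known_rr_lyrae = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
hull = convex_hull(known_rr_lyrae)

# Hypothetical evaluation sample: (colours, is_rr_lyrae).
sample = [
    ((0.5, 0.5), True),
    ((0.2, 0.8), True),
    ((1.5, 0.5), True),   # RR Lyrae outside the hull: lowers completeness
    ((0.9, 0.1), False),  # contaminant inside the hull: lowers efficiency
    ((2.0, 2.0), False),
    ((0.3, 0.3), False),  # contaminant inside the hull
]
labels = [is_rr for _, is_rr in sample]
selected = [inside_hull(hull, colours) for colours, _ in sample]
eff, comp = efficiency_completeness(labels, selected)
print(f"efficiency={eff:.3f}, completeness={comp:.3f}")
```

The toy sample shows the trade-off the abstract reports: a hull that encloses more of the RR Lyrae sample raises completeness but also admits more contaminants, which lowers efficiency. Shrinking the hull, for example by dropping outlying RR Lyrae stars before building it, would move the two numbers in the opposite direction.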
Taxonomy
Topics: SAS software applications and methods
