Machine Learning Classification of Gaia Data Release 2

Yu Bai; JiFeng Liu; Song Wang

arXiv:1808.05728·astro-ph.SR·October 17, 2018

TL;DR

This paper applies machine learning classification to Gaia DR2 objects using combined Pan-STARRS 1 and AllWISE data. It separates stars, galaxies, and QSOs with a total accuracy of 91.9% against the Simbad database and shows that a cut on relative parallax uncertainty could yield a very clean stellar sample.

Contribution

The study classifies 85,613,922 Gaia DR2 objects with machine learning, using Pan-STARRS 1 and AllWISE data, and reaches a total accuracy of 91.9% when the results are cross-matched with the Simbad database.

Findings

01

Total classification accuracy of 91.9% for results cross-matched with the Simbad database

02

Stars constitute approximately 98% of classified objects

03

A threshold of 0 < σ_π/π < 0.2 could yield a very clean stellar sample (about 99.9% stars)

Abstract

Machine learning has become increasingly popular for its ability to make predictions or calculated suggestions for large amounts of data. We apply machine learning classification to 85,613,922 objects in $Gaia$ data release 2, based on the combination of the Pan-STARRS 1 and AllWISE data. The classification results are cross-matched with the Simbad database, and the total accuracy is 91.9%. Our sample is dominated by stars, $\sim$98%, and galaxies make up 2%. For the objects with negative parallaxes, about 2.5% are galaxies and QSOs, while about 99.9% are stars if the relative parallax uncertainties are smaller than 0.2. Our result implies that using the threshold of $0 < \sigma_{\pi}/\pi < 0.2$ could yield a very clean stellar sample.
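The parallax threshold in the abstract can be illustrated with a minimal sketch. It assumes the catalogue is available as NumPy arrays named after the Gaia DR2 `parallax` and `parallax_error` columns. The function name `clean_stellar_mask` and the toy values are hypothetical and do not come from the paper.

```python
import numpy as np


def clean_stellar_mask(parallax, parallax_error, max_rel_err=0.2):
    """Return a boolean mask selecting 0 < sigma_pi / pi < max_rel_err.

    Hypothetical helper: it illustrates the cut, not the authors' pipeline.
    """
    parallax = np.asarray(parallax, dtype=float)
    parallax_error = np.asarray(parallax_error, dtype=float)

    # Relative parallax uncertainty; left as NaN where parallax is exactly zero.
    rel_err = np.full(parallax.shape, np.nan)
    nonzero = parallax != 0
    rel_err[nonzero] = parallax_error[nonzero] / parallax[nonzero]

    # Negative parallaxes give a negative ratio and fail the lower bound;
    # NaN entries compare as False and are excluded as well.
    return (rel_err > 0) & (rel_err < max_rel_err)


# Toy values in milliarcseconds (made up for illustration).
parallax = np.array([2.0, -0.5, 0.1, 5.0])
parallax_error = np.array([0.1, 0.2, 0.3, 2.0])
mask = clean_stellar_mask(parallax, parallax_error)
```

In this toy input only the first source passes: its ratio is 0.05, while the others are negative (-0.4) or too large (3.0 and 0.4). Because the uncertainty is positive, the lower bound of the cut also removes every source with a negative parallax.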
