Improved source classification and performance analysis using Gaia DR3
Sara Jamal, Coryn A. L. Bailer-Jones

TL;DR
This paper improves Gaia DR3 source classification by developing new probabilistic methods and combining classifiers, which significantly increases purity, especially for faint quasars. It also analyzes how performance varies across sky regions.
Contribution
It introduces a new additive classifier combination and a variable prior, leading to improved purity and completeness in Gaia DR3 extragalactic source classification.
Findings
Achieved purities of 55% for quasars and 89% for galaxies with Combmod.
Increased the purity for faint quasars from 20% to 62%.
Enhanced classification performance by combining classifiers and using a variable prior.
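To illustrate why the choice of prior matters for purity, here is a minimal sketch of how class posteriors computed under one prior can be converted to posteriors under another using Bayes' theorem. This is a generic identity, not the paper's implementation: this summary does not specify how the variable prior is defined or how the additive combination is formed. The function name `reweight_posteriors` and the numbers in the usage lines are hypothetical.

```python
import numpy as np

def reweight_posteriors(post_old, prior_old, prior_new):
    """Convert class posteriors from one prior to another.

    Since posterior is proportional to likelihood * prior, dividing by the
    old prior and multiplying by the new one gives the unnormalized new
    posterior; renormalizing over classes completes the conversion.

    post_old  : array (..., n_classes), posteriors computed under prior_old
    prior_old : array broadcastable to post_old, strictly positive
    prior_new : array broadcastable to post_old (may differ per source)
    """
    post_old = np.asarray(post_old, dtype=float)
    ratio = np.asarray(prior_new, dtype=float) / np.asarray(prior_old, dtype=float)
    unnorm = post_old * ratio
    return unnorm / unnorm.sum(axis=-1, keepdims=True)

# Hypothetical two-class example (star, quasar) with made-up numbers:
# a source that looks ambiguous under a flat prior becomes a likely star
# once the prior reflects that stars are far more numerous.
flat_prior = np.array([0.5, 0.5])
star_heavy_prior = np.array([0.9, 0.1])
posterior = reweight_posteriors([0.5, 0.5], flat_prior, star_heavy_prior)
```

Because `prior_new` may be an array with one row per source, the same function covers a prior that changes from source to source, which is one way a variable prior could be applied.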
Abstract
The Discrete Source Classifier (DSC) provides probabilistic classification of sources in Gaia Data Release 3 (GDR3) using a Bayesian framework and a global prior. For the extragalactic classes (quasars and galaxies), the DSC Combmod classifier in GDR3 achieved a high completeness of 92% but a low purity of 22%, due to contamination from the far larger star class. However, these single metrics mask significant variation in performance with magnitude and sky position. Furthermore, a better combination of the individual classifiers is possible. Here we compute two-dimensional representations of the completeness and the purity as a function of Galactic latitude and source brightness, and also exclude the Magellanic Clouds, where stellar contamination significantly reduces the purity. Reevaluated on a cleaner validation set and without introducing changes to the published GDR3 DSC probabilities…
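The two-dimensional completeness and purity maps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses the standard definitions (completeness is the fraction of true class members that are recovered; purity is the fraction of sources assigned to the class that truly belong to it), and the function name, argument names, and bin edges are hypothetical.

```python
import numpy as np

def completeness_purity_2d(true_is_class, pred_is_class, abs_lat, mag,
                           lat_edges, mag_edges):
    """Completeness and purity of one class in (latitude, magnitude) bins.

    true_is_class : boolean array, True where the source truly is the class
    pred_is_class : boolean array, True where the classifier assigns the class
    abs_lat, mag  : per-source absolute Galactic latitude and magnitude
    Returns two arrays of shape (n_lat_bins, n_mag_bins); bins with an
    empty denominator are set to NaN.
    """
    true_is_class = np.asarray(true_is_class, dtype=bool)
    pred_is_class = np.asarray(pred_is_class, dtype=bool)
    bins = [lat_edges, mag_edges]

    # Count true positives, true members, and predicted members per bin
    # by histogramming with 0/1 weights.
    tp, _, _ = np.histogram2d(abs_lat, mag, bins=bins,
                              weights=(true_is_class & pred_is_class).astype(float))
    n_true, _, _ = np.histogram2d(abs_lat, mag, bins=bins,
                                  weights=true_is_class.astype(float))
    n_pred, _, _ = np.histogram2d(abs_lat, mag, bins=bins,
                                  weights=pred_is_class.astype(float))

    with np.errstate(divide="ignore", invalid="ignore"):
        completeness = np.where(n_true > 0, tp / n_true, np.nan)
        purity = np.where(n_pred > 0, tp / n_pred, np.nan)
    return completeness, purity
```

Binning the metrics this way exposes regions where a single global number is misleading, for example bins where a large star count drives the purity down even though completeness stays high.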
