Identifying Radio Active Galactic Nuclei with Machine Learning and Large-Area Surveys
Xu-Liang Fan, Jie Li

TL;DR
This paper develops a supervised machine learning classifier, using optical and mid-infrared features, to distinguish radio AGNs from star-forming galaxies in large-area surveys with high accuracy and reliability.
Contribution
It introduces a classifier trained on multiwavelength data that reaches 97.4% precision and 86.5% recall for AGNs, and applies it to large survey datasets for reliable AGN identification.
Findings
The CatBoost classifier achieved a precision of 97.4% for AGNs.
The classifier's recall for AGNs is 86.5%.
Applied to large datasets, it identified nearly 50,000 AGNs with high reliability.
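To make the two metrics concrete, here is a minimal sketch of how precision (purity) and recall (completeness) are computed for the AGN class. The function name `precision_recall` and the toy labels are illustrative assumptions, not the paper's code or data.

```python
def precision_recall(y_true, y_pred, positive="AGN"):
    """Return (precision, recall) for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: real AGNs that were predicted as AGNs.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: non-AGNs that were predicted as AGNs.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: real AGNs that were predicted as something else.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (hypothetical, for illustration only).
y_true = ["AGN", "AGN", "AGN", "AGN", "SFG", "SFG", "SFG", "SFG"]
y_pred = ["AGN", "AGN", "AGN", "SFG", "SFG", "SFG", "SFG", "AGN"]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision answers "of the sources labelled AGN, how many really are AGNs?", while recall answers "of the real AGNs, how many were found?". A precision above recall, as reported here, means the selected sample is clean but misses some AGNs.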
Abstract
Context. Active galactic nuclei (AGNs) and star-forming galaxies (SFGs) are the primary sources in the extragalactic radio sky, but it is difficult to distinguish the radio emission produced by AGNs from that produced by SFGs, especially when the radio sources are faint. Best et al. (2023) classified the radio sources in LoTSS Deep Fields DR1 through multiwavelength SED fitting. Using their classification results, we perform supervised machine learning to distinguish radio AGNs from radio SFGs. Aims. We aim to provide a supervised classifier for identifying radio AGNs that achieves high purity and high completeness simultaneously and can easily be applied to datasets from large-area surveys. Methods. The classifications of Best et al. (2023) are used as the true labels for supervised machine learning. With the cross-matched sample of LoTSS Deep Fields DR1, AllWISE and Gaia DR3, the features of…
