An automated method for finding the most distant quasars
Lena Lenz, Daniel J. Mortlock, Boris Leistedt, Rhys Barnett, Paul C. Hewett

TL;DR
This paper presents an automated, objective pipeline combining Bayesian model comparison and image goodness-of-fit tests to identify high-redshift quasars efficiently in upcoming large surveys, validated on simulated and real data.
Contribution
It introduces a novel automated selection method for distant quasars using combined flux and image analysis, suitable for future survey data.
Findings
Achieved an AUC score of up to 0.81 on real data.
Attained an F_3 score of up to 0.79 (F_beta with beta = 3, which weights completeness more heavily than efficiency).
Can reach an efficiency of 0.15 at 90% completeness.
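To make the "efficiency at 90% completeness" figure concrete, here is a minimal sketch of how such a number can be computed from a ranked candidate list. It assumes the usual astronomical usage, where completeness is the fraction of true quasars recovered (recall) and efficiency is the fraction of selected candidates that are true quasars (precision). The function name, the toy scores, and the labels are all hypothetical; this is not the paper's code, and ties between scores are not treated specially.

```python
import math


def efficiency_at_completeness(scores, labels, target=0.9):
    """Efficiency (precision) of the shortest top-ranked candidate list
    that recovers at least `target` of the true quasars.

    scores: selection statistic per source, higher = more quasar-like (assumed).
    labels: 1 for a true quasar, 0 for a contaminant.
    """
    n_true = sum(labels)
    if n_true == 0:
        raise ValueError("need at least one true quasar")
    # Number of true quasars that must be recovered; the small offset
    # guards against floating-point round-up in target * n_true.
    needed = max(1, math.ceil(target * n_true - 1e-9))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = 0
    for n_selected, (_, is_quasar) in enumerate(ranked, start=1):
        found += is_quasar
        if found >= needed:
            return found / n_selected
    # Unreachable: selecting every source always recovers all true quasars.
    raise RuntimeError("target completeness not reached")


# Toy data (made up for illustration): two quasars among ten sources.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_efficiency = efficiency_at_completeness(toy_scores, toy_labels, target=0.9)
```

In this toy case both quasars must be recovered to reach 90% completeness, which requires keeping the top three candidates, so two of the three selected sources are quasars.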
Abstract
Upcoming surveys such as Euclid, the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) and the Nancy Grace Roman Telescope (Roman) will detect hundreds of high-redshift (z > 7) quasars, but distinguishing them from the billions of other sources in these catalogues represents a significant data analysis challenge. We address this problem by extending existing selection methods to use both i) Bayesian model comparison on measured fluxes and ii) a likelihood-based goodness-of-fit test on images, which are then combined using the F_beta statistic (where beta is a parameter that can be tuned to prioritise completeness). The result is an automated, reproducible and objective high-redshift quasar selection pipeline. We test this on both simulations and real data from the cross-matched Sloan Digital Sky Survey (SDSS) and UKIRT Infrared Deep Sky Survey (UKIDSS) catalogues.…
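For readers unfamiliar with the F_beta statistic mentioned in the abstract, the sketch below shows its standard definition in terms of efficiency (precision) and completeness (recall). How the paper uses it to combine the flux-based and image-based tests is not described in this excerpt, so the function name and the candidate counts here are purely illustrative assumptions.

```python
def f_beta(true_pos, false_pos, false_neg, beta=3.0):
    """Standard F_beta score from confusion-matrix counts.

    beta > 1 weights completeness (recall) more heavily than
    efficiency (precision); beta = 1 gives the familiar F_1 score.
    """
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)  # efficiency
    recall = true_pos / (true_pos + false_neg)     # completeness
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical selection: 8 quasars recovered, 8 contaminants, 2 quasars missed.
f3 = f_beta(true_pos=8, false_pos=8, false_neg=2, beta=3.0)
f1 = f_beta(true_pos=8, false_pos=8, false_neg=2, beta=1.0)
```

With these made-up counts the completeness (0.8) exceeds the efficiency (0.5), so the beta = 3 score comes out higher than the beta = 1 score: raising beta rewards a selection that misses few quasars even at the cost of more contaminants.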
