Using Machine Learning for Discovery in Synoptic Survey Imaging
Henrik Brink, Joseph W. Richards, Dovi Poznanski, Joshua S. Bloom, John Rice, Sahand Negahban, Martin Wainwright

TL;DR
This paper presents a machine learning framework for real-time identification of astrophysical variability in large-scale sky surveys, improving detection accuracy amidst noise and artefacts.
Contribution
The work introduces a probabilistic ML approach with optimized features that is robust to training data contamination, enhancing discovery efficiency in time-domain astronomy.
Findings
Achieved a missed detection rate of at most 7.7% at a false-positive rate of 1%.
Developed a feature set of 23 attributes from an initial 42, avoiding over-fitting.
Framework is tolerant to up to 10% label contamination in training data.
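As a rough illustration of the headline metric, the sketch below shows one way to compute a missed detection rate at a fixed false-positive rate from classifier scores. It is not the authors' code: the function name `missed_detection_rate_at_fpr`, the convention that a higher score means "more likely real", and the toy scores are all assumptions made for this example.

```python
import numpy as np


def missed_detection_rate_at_fpr(scores_real, scores_bogus, target_fpr=0.01):
    """Hypothetical helper: pick the score threshold that lets through at most
    `target_fpr` of the bogus detections, then report the fraction of real
    detections that fall at or below that threshold (the missed detections).

    Assumes higher scores mean "more likely real" and 0 <= target_fpr < 1.
    """
    real = np.asarray(scores_real, dtype=float)
    # Sort bogus scores from highest to lowest.
    bogus = np.sort(np.asarray(scores_bogus, dtype=float))[::-1]

    # Number of false positives we are allowed to accept.
    k = int(np.floor(target_fpr * bogus.size))

    # Use the (k+1)-th largest bogus score as the threshold; only scores
    # strictly above it are classified as real, so at most k bogus pass.
    threshold = bogus[k]

    fpr = float(np.mean(bogus > threshold))   # achieved false-positive rate
    mdr = float(np.mean(real <= threshold))   # missed detection rate
    return threshold, fpr, mdr


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    real_scores = [0.95, 0.8, 0.5, 0.25]
    bogus_scores = [0.1, 0.2, 0.3, 0.9]
    thr, fpr, mdr = missed_detection_rate_at_fpr(
        real_scores, bogus_scores, target_fpr=0.25
    )
    print(f"threshold={thr}, FPR={fpr}, MDR={mdr}")
```

Fixing the false-positive rate first and then reading off the missed detection rate matches how the finding is stated: the survey sets a budget for bogus candidates, and the metric reports how many real sources are lost under that budget.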
Abstract
Modern time-domain surveys continuously monitor large swaths of the sky to look for astronomical variability. Astrophysical discovery in such data sets is complicated by the fact that detections of real transient and variable sources are highly outnumbered by bogus detections caused by imperfect subtractions, atmospheric effects and detector artefacts. In this work we present a machine learning (ML) framework for discovery of variability in time-domain imaging surveys. Our ML methods provide probabilistic statements, in near real time, about the degree to which each newly observed source is an astrophysically relevant source of variable brightness. We provide details about each of the analysis steps involved, including compilation of the training and testing sets, construction of descriptive image-based and contextual features, and optimization of the feature subset and model tuning…
