A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys
Yufeng Luo, Adam D. Myers, Alex Drlica-Wagner, Dario Dematties, Salma Borchani, Francisco Valdes, Arjun Dey, David Schlegel, Rongpu Zhou, and DESI Legacy Imaging Surveys Team

TL;DR
This paper presents a semi-supervised machine learning pipeline combining vision transformers and kNN to efficiently identify poor-quality astronomical images in large surveys, reducing reliance on manual inspection.
Contribution
The authors develop a semi-supervised approach that pairs a vision transformer, trained via self-supervised learning, with a kNN classifier for scalable image quality assessment in large astronomical surveys.
Findings
Successfully identified 780 problematic exposures in DECaLS DR11
Clustering-space analysis of exposures labeled "good" and "bad" suggests the pipeline determines exposure quality efficiently and accurately
Method reduces manual effort in large-scale survey data quality control
Abstract
As the data volume of astronomical imaging surveys rapidly increases, traditional methods for image anomaly detection, such as visual inspection by human experts, are becoming impractical. We introduce a machine-learning-based approach to detect poor-quality exposures in large imaging surveys, with a focus on the DECam Legacy Survey (DECaLS) in regions of low extinction (i.e., ). Our semi-supervised pipeline integrates a vision transformer (ViT), trained via self-supervised learning (SSL), with a k-Nearest Neighbor (kNN) classifier. We train and validate our pipeline using a small set of labeled exposures observed by surveys with the Dark Energy Camera (DECam). A clustering-space analysis of where our pipeline places images labeled in "good" and "bad" categories suggests that our approach can efficiently and accurately determine the quality of exposures. Applied to new…
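The classification stage described in the abstract (a kNN classifier operating on ViT embeddings of labeled exposures) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the toy 3-dimensional vectors stand in for real ViT embeddings, and the cosine-similarity metric, the majority vote, the value k=3, and the names `cosine_similarity` and `knn_label` are hypothetical choices.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_label(query, embeddings, labels, k=3):
    """Return the majority label among the k labeled embeddings most similar to query.

    The metric and voting rule are illustrative assumptions, not the paper's settings.
    """
    ranked = sorted(
        zip(embeddings, labels),
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,  # most similar first
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for ViT embeddings of a small labeled set (hypothetical values).
labeled_embeddings = [
    [0.90, 0.10, 0.00],
    [0.80, 0.20, 0.10],
    [0.85, 0.05, 0.10],
    [0.10, 0.90, 0.20],
    [0.00, 0.80, 0.30],
    [0.20, 0.95, 0.10],
]
labeled_quality = ["good", "good", "good", "bad", "bad", "bad"]

# An unlabeled exposure whose embedding lies near the "bad" group.
new_exposure = [0.15, 0.85, 0.20]
predicted = knn_label(new_exposure, labeled_embeddings, labeled_quality, k=3)
print(predicted)
```

In a setting like the one the abstract describes, the labeled set would be the small collection of human-inspected exposures, and each new exposure would be embedded by the trained ViT before being passed to the classifier.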
