Does Machine Learning Work? A Comparative Analysis of Strong Gravitational Lens Searches in the Dark Energy Survey

J. Gonzalez; T. Collett; K. Rojas; K. Bechtol; J. A. Acevedo Barroso; A. Melo; A. More; D. Sluse; C. Tortora; P. Holloway; N. E. P. Lines; A. Verma

arXiv:2510.23782 · astro-ph.GA · October 29, 2025

TL;DR

This study compares three machine learning methods for identifying strong gravitational lenses in the Dark Energy Survey, demonstrating that combining diverse classifiers enhances detection completeness and reduces false positives.

Contribution

It provides a systematic evaluation of different ML architectures and ensemble strategies, showing how their combination improves lens detection performance.

Findings

01

F1-scores rise across the three models: 0.31 (Jacobs), 0.35 (Rojas), and 0.54 (Gonzalez).

02

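Combined, the three models recover 82% of moderate- to high-confidence lens candidates, versus 31%, 52%, and 70% for Jacobs, Rojas, and Gonzalez individually.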

03

Combining classifiers reduces false positives significantly.

Abstract

We present a systematic comparison of three independent machine learning (ML)-based searches for strong gravitational lenses applied to the Dark Energy Survey (Jacobs et al. 2019a,b; Rojas et al. 2022; Gonzalez et al. 2025). Each search employs a distinct ML architecture and training strategy, allowing us to evaluate their relative performance, completeness, and complementarity. Using a visually inspected sample of 1651 systems previously reported as lens candidates, we assess how each model scores these systems and quantify their agreement with expert classifications. The three models show progressive improvement in performance, with F1-scores of 0.31, 0.35, and 0.54 for Jacobs, Rojas, and Gonzalez, respectively. Their completeness for moderate- to high-confidence lens candidates follows a similar trend (31%, 52%, and 70%). When combined, the models recover 82% of all such systems,…
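To make the two metrics quoted in the abstract concrete, here is a minimal Python sketch of how an F1-score and a combined completeness can be computed from sets of flagged systems. Everything in it is an assumption for illustration: the system IDs, the three model selections, and the helper names `f1_score`, `f1_for`, and `completeness` are hypothetical and do not come from the paper, and the combination is taken to be a plain union, which the excerpt above does not confirm.

```python
def f1_score(tp, fp, fn):
    """F1 from counts: the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_for(selected, truth):
    """F1 of a set of selected IDs against a set of true IDs."""
    tp = len(selected & truth)   # flagged and confirmed
    fp = len(selected - truth)   # flagged but not confirmed
    fn = len(truth - selected)   # confirmed but missed
    return f1_score(tp, fp, fn)


def completeness(recovered, reference):
    """Fraction of the reference systems that appear in `recovered`."""
    return len(recovered & reference) / len(reference)


# Hypothetical toy data (not from the paper): IDs 0-9 stand in for
# expert-confirmed candidates; each model flags a different subset
# plus a few spurious IDs (10 and above).
expert_lenses = set(range(10))
model_a = {0, 1, 2, 10, 11}
model_b = {2, 3, 4, 5, 12}
model_c = {4, 5, 6, 7, 8, 13}

# Assumed combination rule: a system counts as recovered if any model flags it.
combined = model_a | model_b | model_c

for name, selection in [("A", model_a), ("B", model_b), ("C", model_c),
                        ("combined", combined)]:
    print(name,
          "F1 =", round(f1_for(selection, expert_lenses), 3),
          "completeness =", completeness(selection, expert_lenses))
```

In this toy setup the union is more complete than any single model because the models miss different systems, which is the complementarity the abstract describes. A plain union also accumulates each model's spurious detections, so it should not be read as the paper's approach to reducing false positives.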
