Evaluating the feasibility of interpretable machine learning for globular cluster detection
Dominik Dold, Katja Fahrion

TL;DR
This study assesses machine learning techniques for efficiently detecting extragalactic globular clusters in large photometric datasets, demonstrating high accuracy and transferability across different galaxy environments.
Contribution
The paper applies interpretable ML models to globular cluster detection, showing that they can produce catalogues matching existing ones and generalize across datasets.
Findings
ML models recover 90-94% of GCs with 6-8% false positives
98-99% of GCs recovered in the magnitude range 22-24.5 mag
Models are transferable between different galaxy clusters
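
The two headline numbers above (fraction of GCs recovered, fraction of false detections) can be computed from a labelled test set along the following lines. This is a minimal sketch, not the authors' evaluation code: the function name `detection_metrics` and the toy labels are hypothetical, and it assumes that "false positives" means the share of sources classified as GCs that are not GCs in the reference catalogue. The summary does not state which definition the paper uses.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return the recovered fraction and the false-detection fraction.

    y_true: 1 if the source is a GC in the reference catalogue, else 0.
    y_pred: 1 if the model classifies the source as a GC, else 0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    # Recovered fraction (recall): catalogue GCs that the model finds.
    recovered = tp / (tp + fn) if (tp + fn) else float("nan")
    # Assumed definition: non-GCs among all sources the model flags as GCs.
    false_detections = fp / (tp + fp) if (tp + fp) else float("nan")
    return {"recovered": recovered, "false_detections": false_detections}


# Toy labels for illustration only (not data from the paper).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
toy_metrics = detection_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Restricting `y_true` and `y_pred` to sources within a magnitude interval before calling the function would give a per-magnitude-range figure of the kind quoted for 22-24.5 mag.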
Abstract
Extragalactic globular clusters (GCs) are important tracers of galaxy formation and evolution. Obtaining GC catalogues from photometric data involves several steps which will likely become too time-consuming to perform on the large data volumes that are expected from upcoming wide-field imaging projects such as Euclid. In this work, we explore the feasibility of various machine learning (ML) methods to aid the search for GCs. We use archival Hubble Space Telescope data in the F475W and F850LP bands of 141 early-type galaxies in the Fornax and Virgo galaxy clusters. Using existing GC catalogues to label the data, we obtain an extensive data set of 84929 sources containing 18556 GCs and we train several ML methods both on image and tabular data containing physically relevant features extracted from the images. We find that our evaluated ML models are capable of producing catalogues of…
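
The source counts quoted in the abstract imply an imbalanced classification problem, which is worth making explicit. The short calculation below uses only the two numbers given (84929 sources, 18556 GCs); the variable names are illustrative, and the "label everything as non-GC" baseline is a hypothetical comparison added here, not something taken from the paper.

```python
# Counts quoted in the abstract.
n_sources = 84929
n_gcs = 18556

n_non_gcs = n_sources - n_gcs
gc_fraction = n_gcs / n_sources

# Hypothetical trivial baseline: classify every source as a non-GC.
# Its accuracy equals the non-GC fraction, yet it recovers no GCs at all.
baseline_accuracy = n_non_gcs / n_sources

print(f"non-GC sources:    {n_non_gcs}")
print(f"GC fraction:       {gc_fraction:.3f}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
```

Because roughly one source in five is a GC, plain accuracy would look high even for a useless classifier, so class-specific measures such as the fraction of GCs recovered are more informative for this data set.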
Taxonomy
Topics: Galaxies: Formation, Evolution, Phenomena · Topological and Geometric Data Analysis · Machine Learning and Data Classification
