TL;DR
This paper introduces GalaxySD, a conditional diffusion model that synthesizes realistic galaxy images to augment training data, significantly improving galaxy morphology classification and rare object detection in astronomical surveys.
Contribution
The paper presents a novel generative model, GalaxySD, that produces high-fidelity galaxy images conditioned on morphology, enhancing machine learning performance and enabling extrapolation to unseen domains.
Findings
Improved classification metrics by up to 30% with augmented data.
More than doubled detection of rare galaxy objects, from 352 to 872.
Generated images closely match specified morphological features.
Abstract
Observational astronomy relies on visual feature identification to detect critical astrophysical phenomena. While machine learning (ML) increasingly automates this process, models often struggle to generalize in large-scale surveys because labeled datasets, whether from simulations or human annotation, have limited representativeness, a challenge especially pronounced for rare yet scientifically valuable objects. To address this, we propose GalaxySD, a conditional diffusion model that synthesizes realistic galaxy images for augmenting ML training data. Leveraging the Galaxy Zoo 2 dataset, which contains pairs of visual features and galaxy images from volunteer annotation, we demonstrate that GalaxySD generates diverse, high-fidelity galaxy images that closely adhere to the specified morphological feature conditions. Moreover, this model enables generative extrapolation to project…
Taxonomy
Methods: Diffusion
