Realistic galaxy images and improved robustness in machine learning tasks from generative modelling

Benjamin J. Holzschuh, Conor M. O'Riordan, Simona Vegetti, Vicente Rodriguez-Gomez, and Nils Thuerey

arXiv:2203.11956·astro-ph.GA·July 27, 2022


TL;DR

This paper demonstrates that generative models can produce highly realistic galaxy images, and mixing these with real data enhances the robustness of machine learning models against domain shifts and out-of-distribution data.

Contribution

The study introduces a method of augmenting training data with generative galaxy images to improve model robustness in astrophysical tasks.
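As a rough illustration of what mixing generated images into a training set could look like, the sketch below builds a shuffled training list in which a chosen fraction of samples comes from a generative model. The function name `mix_training_set`, the `generated_fraction` parameter, and the default value of 0.5 are assumptions made for this example; the summary above does not state which mixing ratio or procedure the authors use.

```python
import random


def mix_training_set(real, generated, generated_fraction=0.5, seed=0):
    """Return a shuffled list of (sample, source) pairs.

    Hypothetical helper: all real samples are kept, and enough generated
    samples are drawn so that they make up `generated_fraction` of the result.
    """
    if not 0.0 <= generated_fraction < 1.0:
        raise ValueError("generated_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_gen / (n_real + n_gen) = generated_fraction for n_gen.
    n_generated = round(len(real) * generated_fraction / (1.0 - generated_fraction))
    if n_generated > len(generated):
        raise ValueError("not enough generated samples for this fraction")
    chosen = rng.sample(list(generated), n_generated)
    mixed = [(x, "real") for x in real] + [(x, "generated") for x in chosen]
    rng.shuffle(mixed)
    return mixed


# Toy usage: integers stand in for galaxy images.
real_images = list(range(8))
generated_images = list(range(100, 120))
training_set = mix_training_set(real_images, generated_images, generated_fraction=0.5)
```

Keeping the source label alongside each sample makes it easy to check afterwards how a downstream model behaves on real and generated inputs separately.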

Findings

01

Generated galaxy images closely match real data properties.

02

Mixing generated data improves model robustness against domain shifts.

03

Generative models produce galaxy images that are visually indistinguishable from the source data.

Abstract

We examine the capability of generative models to produce realistic galaxy images. We show that mixing generated data with the original data improves the robustness in downstream machine learning tasks. We focus on three different data sets: analytical Sérsic profiles, real galaxies from the COSMOS survey, and galaxy images produced with the SKIRT code from the IllustrisTNG simulation. We quantify the performance of each generative model using the Wasserstein distance between the distributions of morphological properties (e.g. the Gini-coefficient, the asymmetry, and ellipticity), the surface brightness distribution on various scales (as encoded by the power-spectrum), the bulge statistic and the colour for the generated and source data sets. With an average Wasserstein distance (Fréchet Inception Distance) of $7.19 \times 10^{-2}$ (0.55), $5.98 \times 10^{-2}$ (1.45) and $5.08…
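The evaluation idea in the abstract, comparing the distribution of a morphological property between source and generated images with the Wasserstein distance, can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the Gini coefficient below uses the form common in the galaxy morphology literature (Lotz et al. 2004) applied to all pixels without a segmentation map, the distance is the one-dimensional first Wasserstein distance between empirical samples, and the function names and toy images are made up for the example.

```python
from bisect import bisect_right


def gini(pixels):
    """Gini coefficient of a flat list of pixel fluxes.

    Assumed form: G = sum_i (2i - n - 1) |x_i| / (mean(|x|) * n * (n - 1)),
    with pixels sorted in ascending order of |x_i|.
    """
    values = sorted(abs(p) for p in pixels)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("image has no flux")
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(values, start=1))
    return weighted / (mean * n * (n - 1))


def wasserstein_1d(u, v):
    """First Wasserstein distance between two 1D empirical samples.

    Computed as the integral of |F_u(x) - F_v(x)| over x, where F_u and F_v
    are the empirical cumulative distribution functions.
    """
    u_sorted, v_sorted = sorted(u), sorted(v)
    if not u_sorted or not v_sorted:
        raise ValueError("both samples must be non-empty")
    grid = sorted(u_sorted + v_sorted)
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


# Toy 2x2 "images" flattened to lists of pixel fluxes (illustrative only).
source_images = [[1, 1, 1, 1], [1, 2, 3, 4], [0, 0, 1, 5]]
generated_images = [[1, 1, 1, 2], [1, 2, 2, 4], [0, 1, 1, 5]]

source_gini = [gini(img) for img in source_images]
generated_gini = [gini(img) for img in generated_images]
distance = wasserstein_1d(source_gini, generated_gini)
```

A smaller distance means the generated images reproduce the distribution of that property more closely. The same comparison could be repeated for other summary statistics named in the abstract, such as asymmetry or ellipticity, given a function that measures them.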
