Extracting an Informative Latent Representation of High-Dimensional Galaxy Spectra
Daiki Iwasaki, Suchetha Cooray, Tsutomu T. Takeuchi

TL;DR
This paper identifies four key latent variables that efficiently capture the diversity of galaxy spectra, offering a more fundamental representation than traditional physical properties, and highlights the spectral regions most informative for this task.
Contribution
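The study identifies a four-dimensional latent space for galaxy spectra observed by SDSS, and uses conditional variational autoencoders trained with different combinations of galaxy physical properties to quantify how much information those properties carry for reconstructing the spectra.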
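To make the conditional variational autoencoder idea concrete, here is a minimal NumPy sketch of a single forward pass and loss evaluation. It is not the authors' architecture: the layer sizes, the number of conditioning properties, the tanh activations, the squared-error reconstruction term, and every name (`cvae_forward`, `negative_elbo`, `N_PIX`, and so on) are assumptions chosen for illustration. Only the four-dimensional latent space comes from the summary above. The toy inputs are random numbers, not SDSS spectra, and no training loop is included.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PIX = 64      # hypothetical number of spectral pixels (real spectra have far more)
N_COND = 2      # hypothetical number of conditioning physical properties
N_LATENT = 4    # four latent variables, as stated in the summary
N_HIDDEN = 16   # hypothetical hidden-layer width


def init_linear(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


params = {
    "enc": init_linear(N_PIX + N_COND, N_HIDDEN),
    "mu": init_linear(N_HIDDEN, N_LATENT),
    "logvar": init_linear(N_HIDDEN, N_LATENT),
    "dec1": init_linear(N_LATENT + N_COND, N_HIDDEN),
    "dec2": init_linear(N_HIDDEN, N_PIX),
}


def dense(x, layer):
    """Affine map x @ W + b."""
    w, b = layer
    return x @ w + b


def cvae_forward(x, c, params, rng):
    """Encode (x, c) to a diagonal Gaussian, sample z, decode (z, c) to a spectrum."""
    h = np.tanh(dense(np.concatenate([x, c], axis=1), params["enc"]))
    mu = dense(h, params["mu"])
    logvar = dense(h, params["logvar"])
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps  # reparameterization trick
    h_dec = np.tanh(dense(np.concatenate([z, c], axis=1), params["dec1"]))
    x_hat = dense(h_dec, params["dec2"])
    return x_hat, mu, logvar


def negative_elbo(x, x_hat, mu, logvar):
    """Batch mean of squared-error reconstruction plus KL(q(z|x,c) || N(0, I))."""
    recon = np.sum((x - x_hat) ** 2, axis=1)
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1)
    return float(np.mean(recon + kl))


# Toy batch: random values standing in for normalized spectra and properties.
x = rng.normal(size=(8, N_PIX))
c = rng.normal(size=(8, N_COND))
x_hat, mu, logvar = cvae_forward(x, c, params, rng)
loss = negative_elbo(x, x_hat, mu, logvar)
```

Because the condition vector `c` is fed to both the encoder and the decoder, the latent code `z` only needs to carry whatever the conditioning properties do not already explain. Comparing reconstruction quality across models trained with different choices of `c` is one way to gauge how informative each property is.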
Findings
Four latent variables effectively represent galaxy spectral diversity.
Spectral regions below 5000 Å and emission lines are highly informative.
Complex SED models with a very large number of parameters are likely to be hard to constrain, even with spectroscopic data.
Abstract
To understand the fundamental parameters of galaxy evolution, we investigated the minimum set of parameters that explain the observed galaxy spectra in the local Universe. We identified four latent variables that efficiently represent the diversity of high-dimensional galaxy spectral energy distributions (SEDs) observed by the Sloan Digital Sky Survey. Additionally, we constructed a meaningful latent representation using conditional variational autoencoders trained with different permutations of galaxy physical properties, which helped us quantify the information that these traditionally used properties have on the reconstruction of galaxy spectra. The four parameters suggest a view that complex SED population models with a very large number of parameters will be difficult to constrain even with spectroscopic galaxy data. Through an Explainable AI (XAI) method, we found that the region…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Time Series Analysis and Forecasting · Data Analysis with R
