Learning novel representations of variable sources from multi-modal $\textit{Gaia}$ data via autoencoders

P. Huijse; J. De Ridder; L. Eyer; L. Rimoldini; B. Holl; N. Chornay; J. Roquette; K. Nienartowicz; G. Jevardat de Fombelle; D. J. Fritzewski; A. Kemp; V. Vanlaer; M. Vanrespaille; H. Wang; M.I. Carnerero; C.M. Raiteri; G. Marton; M. Madarász; G. Clementini; P. Gavras; C. Aerts

arXiv:2505.16320 · astro-ph.IM · September 16, 2025

TL;DR

This paper proposes a machine learning methodology that trains variational autoencoders on multiple Gaia DR3 data products (XP low-resolution spectra, G-band magnitude differences, and folded G-band light curves) to achieve an unsupervised classification of stellar and quasar variability.

Contribution

It develops a novel multi-modal autoencoder framework that integrates spectral, photometric, and light curve data for improved variability classification.

Findings

01. Effective separation of variability classes in latent space

02. Strong correlation between latent features and astrophysical properties

03. Enhanced variability analysis through combined data representations

Abstract

Gaia Data Release 3 (DR3) published for the first time epoch photometry, BP/RP (XP) low-resolution mean spectra, and supervised classification results for millions of variable sources. This extensive dataset offers a unique opportunity to study their variability by combining multiple Gaia data products. In preparation for DR4, we propose and evaluate a machine learning methodology capable of ingesting multiple Gaia data products to achieve an unsupervised classification of stellar and quasar variability. A dataset of 4 million Gaia DR3 sources is used to train three variational autoencoders (VAEs), which are artificial neural networks (ANNs) designed for data compression and generation. One VAE is trained on Gaia XP low-resolution spectra, another on a novel approach based on the distribution of magnitude differences in the Gaia G band, and the third on folded Gaia G band light curves.…
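As a rough illustration of two of the time-series inputs named in the abstract, the sketch below phase-folds a light curve on a known period and builds a histogram of magnitude differences between pairs of epochs. The abstract does not say how the authors construct either input, so the function names, the use of all epoch pairs, the bin count, and the magnitude range are assumptions made for this example and are not the paper's preprocessing.

```python
import numpy as np


def fold_light_curve(times, mags, period, t0=0.0):
    """Phase-fold a light curve: phase = ((t - t0) / period) mod 1.

    Returns phases in [0, 1) and the magnitudes, both sorted by phase.
    """
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)
    phase = ((times - t0) / period) % 1.0
    order = np.argsort(phase)
    return phase[order], mags[order]


def magnitude_difference_histogram(mags, bins=10, mag_range=(-1.0, 1.0)):
    """Normalised histogram of magnitude differences over all epoch pairs.

    Hypothetical helper: pairing every epoch with every later epoch is an
    assumption of this sketch, not a detail taken from the paper.
    """
    mags = np.asarray(mags, dtype=float)
    i, j = np.triu_indices(len(mags), k=1)  # all pairs with i < j
    dm = mags[j] - mags[i]
    hist, edges = np.histogram(dm, bins=bins, range=mag_range)
    total = hist.sum()
    density = hist / total if total > 0 else hist.astype(float)
    return density, edges


# Synthetic example: a sinusoidal variable sampled at irregular epochs.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 100.0, size=50))
mags = 15.0 + 0.3 * np.sin(2.0 * np.pi * times / 2.5)

phase, folded_mags = fold_light_curve(times, mags, period=2.5)
dm_density, dm_edges = magnitude_difference_histogram(mags)
```

Both outputs are fixed-length or easily resampled arrays, which is the kind of input an autoencoder needs. Folding requires a period estimate, whereas the magnitude-difference histogram does not, so the latter can also describe sources without a well-defined period.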
