Unsupervised classification of SDSS galaxy spectra
Didier Fraix-Burnet (IPAG), C. Bouveyron (JAD), J. Moultaka (IRAP)

TL;DR
This paper presents an unsupervised classification of 702,248 SDSS spectra of galaxies and quasars into 86 classes, providing an objective way to generate galaxy templates and organize large spectral datasets.
Contribution
It applies Fisher-EM, an unsupervised clustering algorithm that relies on a discriminative latent mixture model, to a large set of galaxy spectra, yielding an automatic classification.
Findings
86 optimal classes identified from spectra
Classification agrees with literature templates with ~85% accuracy
Mean spectra serve as effective galaxy templates
Abstract
Defining templates of galaxy spectra is useful to quickly characterise new observations and organise databases from surveys. These templates are usually built from a pre-defined classification based on other criteria. We present an unsupervised classification of 702248 spectra of galaxies and quasars with redshifts smaller than 0.25 that were retrieved from the Sloan Digital Sky Survey (SDSS) database, release 7. The spectra were first corrected for redshift, then wavelet-filtered to reduce the noise, and finally binned to obtain about 1437 wavelengths per spectrum. The unsupervised clustering algorithm Fisher-EM, relying on a discriminative latent mixture model, was applied to these corrected spectra. The full set and several subsets of 100000 and 300000 spectra were analysed. The optimum number of classes given by a penalised likelihood criterion is 86 classes, of which the 37…
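The redshift correction and binning steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `to_rest_frame` and `bin_spectrum`, the bin edges, and the sample values are hypothetical, the averaging scheme is an assumption, and the wavelet filtering and Fisher-EM clustering steps are omitted.

```python
from bisect import bisect_right


def to_rest_frame(wavelengths, z):
    """Shift observed wavelengths to the rest frame: lambda_rest = lambda_obs / (1 + z)."""
    return [w / (1.0 + z) for w in wavelengths]


def bin_spectrum(wavelengths, fluxes, edges):
    """Average the flux inside each bin [edges[i], edges[i+1]).

    Assumed scheme for illustration only; bins with no samples yield None.
    """
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for w, f in zip(wavelengths, fluxes):
        i = bisect_right(edges, w) - 1  # index of the bin containing w
        if 0 <= i < n_bins:             # ignore samples outside the grid
            sums[i] += f
            counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy spectrum (illustrative values, not SDSS data) at redshift z = 0.125.
observed = [4500.0, 4612.5, 4725.0, 4837.5]
fluxes = [1.0, 3.0, 2.0, 4.0]
rest = to_rest_frame(observed, 0.125)
binned = bin_spectrum(rest, fluxes, [4000.0, 4200.0, 4400.0])
```

Placing every spectrum on the same rest-frame wavelength grid is what lets a clustering algorithm compare spectra bin by bin, regardless of each object's redshift.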
