Identification of Interesting Objects in Large Spectral Surveys Using Highly Parallelized Machine Learning
Petr Škoda, Andrej Palička, Jakub Koza, Ksenia Shakurova

TL;DR
This paper introduces a highly parallelized machine learning approach to improve the identification of specific interesting objects, like Be stars and quasars, in large spectral surveys such as LAMOST, using spectral line shape analysis.
Contribution
It presents a novel Spark-based semi-supervised machine learning method that enhances classification reliability for boundary cases in large spectral datasets, incorporating domain adaptation with a physical model.
Findings
Identified dozens of Be stars, including potential new discoveries.
Detected spectra resembling quasars and blazars.
Found numerous instrumental artifacts in the survey data.
Abstract
The current archives of the LAMOST multi-object spectrograph contain millions of fully reduced spectra, from which automatic pipelines have produced catalogues of many parameters of individual objects, including their approximate spectral classification. This classification, however, is mostly based on the global shape of the whole spectrum and on integral properties of spectra in given bandpasses, namely the presence and equivalent width of prominent spectral lines, whereas for identifying some interesting object types (e.g. Be stars or quasars) the detailed shape of only a few lines is crucial. Here machine learning brings a new methodology capable of improving the reliability of classification of such objects even in boundary cases. We present results of Spark-based semi-supervised machine learning of LAMOST spectra attempting to automatically identify the single and double-peak emission…
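The abstract's point that integral properties such as equivalent width can miss what the detailed line shape reveals can be illustrated with a small toy example. The sketch below is not the authors' Spark-based method. It builds two synthetic, continuum-normalized emission profiles near Hα (6562.8 Å), one single-peaked and one double-peaked, chosen so that their equivalent widths match. All function names, the Gaussian profile model, the wavelength grid, the `min_height` threshold, and the sign convention (emission counted as positive) are assumptions made for illustration only.

```python
import math

H_ALPHA = 6562.8  # rest wavelength of H-alpha in Angstrom


def gaussian(x, center, sigma, amplitude):
    """Gaussian emission component above the continuum."""
    return amplitude * math.exp(-0.5 * ((x - center) / sigma) ** 2)


def make_profile(wavelengths, components):
    """Continuum-normalized flux: 1.0 plus a sum of Gaussian components.

    `components` is a list of (center, sigma, amplitude) tuples (hypothetical model).
    """
    return [
        1.0 + sum(gaussian(w, c, s, a) for c, s, a in components)
        for w in wavelengths
    ]


def equivalent_width(wavelengths, flux):
    """Trapezoidal integral of (flux - 1) over wavelength.

    Emission is counted as positive here; this sign convention is an assumption.
    """
    total = 0.0
    for i in range(len(wavelengths) - 1):
        dw = wavelengths[i + 1] - wavelengths[i]
        total += 0.5 * ((flux[i] - 1.0) + (flux[i + 1] - 1.0)) * dw
    return total


def count_peaks(flux, min_height=1.1):
    """Count strict local maxima that rise above `min_height`."""
    return sum(
        1
        for i in range(1, len(flux) - 1)
        if flux[i] > min_height and flux[i] > flux[i - 1] and flux[i] > flux[i + 1]
    )


# Synthetic wavelength grid around H-alpha, 0.1 Angstrom sampling (assumed).
wavelengths = [6540.0 + 0.1 * i for i in range(461)]

# One broad component versus two narrow components with the same total area.
single_flux = make_profile(wavelengths, [(H_ALPHA, 2.0, 1.0)])
double_flux = make_profile(
    wavelengths, [(H_ALPHA - 3.0, 1.0, 1.0), (H_ALPHA + 3.0, 1.0, 1.0)]
)

if __name__ == "__main__":
    for name, flux in [("single", single_flux), ("double", double_flux)]:
        ew = equivalent_width(wavelengths, flux)
        print(f"{name}: EW = {ew:.3f} A, peaks = {count_peaks(flux)}")
```

Both profiles integrate to essentially the same equivalent width, yet a simple shape-sensitive measure (the number of peaks) separates them. Real survey spectra are noisy and vary in resolution, so a fixed peak counter like this one would be fragile, which is part of the motivation for learning the line shape from data.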
