Identification of Interesting Objects in Large Spectral Surveys Using Highly Parallelized Machine Learning

Petr Škoda; Andrej Palička; Jakub Koza; Ksenia Shakurova

arXiv:1612.07536·astro-ph.IM·June 14, 2017·Astroinformatics

TL;DR

This paper introduces a highly parallelized machine learning approach to improve the identification of specific interesting objects, like Be stars and quasars, in large spectral surveys such as LAMOST, using spectral line shape analysis.

Contribution

It presents a novel Spark-based semi-supervised machine learning method that enhances classification reliability for boundary cases in large spectral datasets, incorporating domain adaptation with a physical model.

Findings

1. Identified dozens of Be stars, including potential new discoveries.
2. Detected spectra resembling quasars and blazars.
3. Found numerous instrumental artifacts in the survey data.

Abstract

The current archives of the LAMOST multi-object spectrograph contain millions of fully reduced spectra, from which automatic pipelines have produced catalogues of many parameters of individual objects, including their approximate spectral classification. This classification, however, is based mostly on the global shape of the whole spectrum and on integral properties of spectra in given bandpasses, namely the presence and equivalent width of prominent spectral lines, whereas for identifying some interesting object types (e.g. Be stars or quasars) the detailed shape of only a few lines is crucial. Here machine learning offers a new methodology capable of improving the reliability of classification of such objects even in boundary cases. We present results of Spark-based semi-supervised machine learning of LAMOST spectra attempting to automatically identify the single and double-peak emission…
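
To make concrete what "single and double-peak emission" in one line means, here is a minimal rule-based sketch in Python. It is not the authors' method (the paper uses Spark-based semi-supervised learning). It only counts local maxima above a crude continuum level in a window around H-alpha. The function name, window width, threshold, and the synthetic spectra are all assumptions made for illustration.

```python
import math

H_ALPHA = 6562.8  # approximate rest wavelength of H-alpha in Angstrom


def count_emission_peaks(wavelengths, fluxes, center=H_ALPHA,
                         half_width=30.0, threshold=1.2):
    """Count local maxima above threshold * continuum near `center`.

    Hypothetical helper: the continuum is estimated as the median flux
    inside the window, which is only reasonable when the line occupies
    a small part of the window.
    """
    flux = [f for w, f in zip(wavelengths, fluxes)
            if abs(w - center) <= half_width]
    if len(flux) < 3:
        return 0
    continuum = sorted(flux)[len(flux) // 2]  # crude median estimate
    level = threshold * continuum
    peaks = 0
    for i in range(1, len(flux) - 1):
        # strict on the left, non-strict on the right, so a flat
        # two-sample top is counted once
        if flux[i] > level and flux[i] > flux[i - 1] and flux[i] >= flux[i + 1]:
            peaks += 1
    return peaks


def gaussian_line(wavelengths, centers, amplitude=2.0, sigma=1.5):
    """Synthetic noise-free spectrum: unit continuum plus Gaussian emission."""
    return [1.0 + sum(amplitude * math.exp(-0.5 * ((w - c) / sigma) ** 2)
                      for c in centers)
            for w in wavelengths]


# Synthetic test spectra (assumed values, not LAMOST data)
wl = [6500.0 + 0.5 * i for i in range(241)]
flat = [1.0] * len(wl)                                     # no emission
single = gaussian_line(wl, [H_ALPHA])                      # one peak
double = gaussian_line(wl, [H_ALPHA - 3.0, H_ALPHA + 3.0])  # two peaks

print(count_emission_peaks(wl, flat),
      count_emission_peaks(wl, single),
      count_emission_peaks(wl, double))
```

On real, noisy spectra a rule like this would need smoothing and proper continuum fitting, and it degrades in exactly the boundary cases the abstract mentions, which is part of the motivation for a learned classifier.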
