Study on Outliers in the Big Stellar Spectral Dataset of the Fifth Data Release (DR5) of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST)
Yan Lu, A-Li Luo, Li-Li Wang, Li Qin, Rui Wang, Xiang-Lei Chen, Bing Du, Fang Zuo, Wen Hou, Jian-Jun Chen, Yan-Ke Tang, Jin-Shu Han, Yong-Heng Zhao

TL;DR
This study applies an improved outlier detection method to more than 3 million stellar spectra from LAMOST DR5, then selects and analyzes the highest-ranked outlier spectra to assess data quality and the accuracy of the derived stellar parameters.
Contribution
It introduces an improved Local Outlier Factor (LOF) method that combines Principal Component Analysis (PCA) with Monte Carlo sub-sampling, so that outlier rankings can be computed in parallel for large astronomical spectral datasets.
Findings
Identified 3,627 outlier spectra, about 0.1% of the dataset.
Outliers are uniformly distributed across parameter space.
Most outliers show signs of data issues like nebular contamination.
Abstract
To study the quality of stellar spectra from the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) and the correctness of the corresponding stellar parameters derived by the LAMOST Stellar Parameter Pipeline (LASP), an outlier analysis method is applied to the archived AFGK stars in the fifth data release (DR5) of LAMOST. An outlier factor is defined in order to rank more than 3 million stellar spectra selected from the DR5 Stellar Parameter catalog. We propose an improved Local Outlier Factor (LOF) method based on Principal Component Analysis and Monte Carlo sampling. It computes LOF rankings for randomly picked sub-samples in parallel on multiple computers and then combines them to obtain the outlier ranking of each spectrum in the entire dataset. In total, the 3,627 highest-ranked outlier spectra, around one-thousandth of all spectra, are selected and…
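The sub-sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the brute-force LOF, the choice to average scores over rounds, and all parameter values (number of components, k, number of chunks and rounds) are assumptions made for illustration, and the data are synthetic. Each round shuffles the PCA-projected data, splits it into chunks that could be scored on separate machines, and accumulates a LOF score per point.

```python
import numpy as np

def pca_project(X, n_components):
    """Project rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def lof_scores(X, k):
    """Brute-force Local Outlier Factor; larger values are more outlying."""
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)                  # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]            # indices of the k nearest neighbours
    nn_dist = np.take_along_axis(d, nn, axis=1)  # distances to those neighbours
    k_dist = nn_dist[:, -1]                      # distance to the k-th neighbour
    reach = np.maximum(k_dist[nn], nn_dist)      # reachability distances
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)     # local reachability density
    return lrd[nn].mean(axis=1) / lrd

def monte_carlo_lof(X, n_components=5, k=10, n_chunks=5, rounds=10, seed=0):
    """Average LOF over random sub-samples (hypothetical aggregation rule)."""
    rng = np.random.default_rng(seed)
    Z = pca_project(X, n_components)
    total = np.zeros(len(Z))
    for _ in range(rounds):
        # Each chunk is independent, so chunks could be scored in parallel.
        for idx in np.array_split(rng.permutation(len(Z)), n_chunks):
            total[idx] += lof_scores(Z[idx], k)
    return total / rounds

# Synthetic stand-in for spectra: 500 ordinary rows plus 5 widely scattered rows.
rng = np.random.default_rng(42)
inliers = rng.normal(size=(500, 20))
outliers = rng.normal(scale=10.0, size=(5, 20))
scores = monte_carlo_lof(np.vstack([inliers, outliers]))
ranking = np.argsort(scores)[::-1]               # most outlying first
```

In this sketch every chunk must contain more than k points. Sorting the averaged scores gives a global ranking from which the top fraction, for example about one-thousandth of the rows, could be selected for inspection.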
