Detecting outliers and learning complex structures with large spectroscopic surveys - a case study with APOGEE stars
Itamar Reis, Dovi Poznanski, Dalya Baron, Gail Zasowski, Sahar Shahaf

TL;DR
This paper applies an unsupervised random forest algorithm to APOGEE stellar spectra to identify outliers, uncover complex structures, and discover rare and previously unknown stellar objects, demonstrating machine learning's potential in astronomical data analysis.
Contribution
The paper extends a recently introduced outlier detection method, based on an unsupervised random forest, to a large spectroscopic survey, revealing new stellar objects and complex structures in the data.
Findings
Identified previously unknown Be-type stars and spectroscopic binaries.
Detected extreme objects for which the stellar models fail, and rare objects that fall outside the scope of the models.
Showed that the similarity measure traces non-trivial physical properties and captures complex structures in the data.
Abstract
In this work we apply and expand on a recently introduced outlier detection algorithm that is based on an unsupervised random forest. We use the algorithm to calculate a similarity measure for stellar spectra from the Apache Point Observatory Galactic Evolution Experiment (APOGEE). We show that the similarity measure traces non-trivial physical properties and contains information about complex structures in the data. We use it for visualization and clustering of the dataset, and discuss its ability to find groups of highly similar objects, including spectroscopic twins. Using the similarity matrix to search the dataset for objects allows us to find objects that are impossible to find using their best fitting model parameters. This includes extreme objects for which the models fail, and rare objects that are outside the scope of the model. We use the similarity measure to detect outliers…
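To make the idea of a random-forest similarity measure concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes scikit-learn and NumPy, runs on toy "spectra" generated on the spot, and uses one common recipe for an unsupervised random forest (train the forest to separate the real data from a synthetic copy whose features are shuffled independently, then count how often two real objects share a leaf). The published algorithm may differ in such details. All function names (`synthetic_copy`, `rf_similarity`, `weirdness`, `most_similar`) and parameter values are hypothetical, chosen for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def synthetic_copy(X, rng):
    """Shuffle every feature column independently (assumed recipe).

    The marginal distribution of each feature is preserved, but the
    correlations between features are destroyed.
    """
    return np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])


def rf_similarity(X, n_trees=100, seed=0):
    """Pairwise similarity: fraction of trees in which two objects share a leaf."""
    rng = np.random.default_rng(seed)
    X_all = np.vstack([X, synthetic_copy(X, rng)])
    y_all = np.concatenate([np.ones(len(X)), np.zeros(len(X))])  # 1 = real, 0 = synthetic

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    leaves = forest.apply(X)  # shape (n_objects, n_trees): leaf index per tree
    sim = np.zeros((len(X), len(X)))
    for t in range(n_trees):
        col = leaves[:, t]
        sim += col[:, None] == col[None, :]
    return sim / n_trees


def weirdness(sim):
    """Outlier score: mean dissimilarity of each object to all other objects."""
    n = len(sim)
    return 1.0 - (sim.sum(axis=1) - 1.0) / (n - 1)  # subtract the self-similarity of 1


def most_similar(sim, i, k=5):
    """Indices of the k objects most similar to object i, excluding i itself."""
    order = [j for j in np.argsort(-sim[i]) if j != i]
    return order[:k]


# Toy data (an assumption, not APOGEE): one absorption line of varying depth.
rng = np.random.default_rng(42)
pixels = np.linspace(-1.0, 1.0, 20)
profile = np.exp(-0.5 * (pixels / 0.2) ** 2)
depth = rng.uniform(0.1, 0.8, size=150)
X = 1.0 - depth[:, None] * profile[None, :] + rng.normal(0.0, 0.01, size=(150, 20))

sim = rf_similarity(X)
scores = weirdness(sim)
neighbours = most_similar(sim, 0)
print("Highest outlier scores:", np.argsort(-scores)[:5])
print("Objects most similar to object 0:", neighbours)
```

Objects with the highest `weirdness` score are the outlier candidates, while `most_similar` illustrates searching the dataset by similarity rather than by best-fitting model parameters. The same matrix, converted to a distance as `1 - sim`, could be passed to a standard embedding or clustering routine for visualization.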
