An Efficient Method for Rare Spectra Retrieval in Astronomical Databases

Changde Du; Ali Luo; Haifeng Yang; Wen Hou; Yanxin Guo

arXiv:1603.04544·astro-ph.GA·March 23, 2016

TL;DR

This paper introduces a novel rank-based method combining bipartite ranking and bootstrap aggregating to efficiently identify rare spectral objects like carbon stars in large astronomical datasets, outperforming existing techniques.

Contribution

The paper presents a new hybrid ranking method that improves rare object detection in massive spectral datasets, addressing limitations of binary classification approaches.

Findings

01

The proposed method is more effective than the other popular data mining methods it was compared against.

02

It is also less time-consuming than those competing methods in large-scale data searches.

03

Experiments searching for carbon stars in SDSS DR10 spectral data support both claims.

Abstract

One of the important aims of astronomical data mining is to systematically search for specific rare objects in a massive spectral dataset, given a small fraction of identified samples of the same type. Most existing methods are based on binary classification, which usually suffers from incompleteness when the known samples are too few. Rank-based methods, by contrast, can provide good solutions in such cases. After investigating several algorithms, a method combining a bipartite ranking model with bootstrap aggregating techniques was developed in this paper. The method was applied to the search for carbon stars in the spectral data of the Sloan Digital Sky Survey (SDSS) DR10 and compared with several other popular data mining methods. Experimental results validate that the proposed method is not only the most effective but also less time consuming among these competitors automatically…
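
To make the idea in the abstract concrete, here is a minimal sketch of bipartite ranking combined with bootstrap aggregating. It is an illustration only, not the authors' implementation: the linear scorer, the pairwise logistic loss, the function names (train_pairwise_ranker, bagged_rank), the hyperparameters, and the 2-D toy features are all assumptions made for this example. Each round resamples the known positives and the background with replacement, fits a scorer that tries to place positives above background objects, and the per-round scores are averaged before candidates are sorted.

```python
import math
import random


def train_pairwise_ranker(pos, neg, steps=300, lr=0.1, rng=None):
    """Fit a linear scorer w so that w.x tends to be higher for positives.

    Runs SGD on the pairwise logistic loss log(1 + exp(-(w.p - w.n)))
    over randomly drawn (positive, negative) pairs.
    """
    rng = rng or random.Random(0)
    w = [0.0] * len(pos[0])
    for _ in range(steps):
        p, n = rng.choice(pos), rng.choice(neg)
        diff = [a - b for a, b in zip(p, n)]
        margin = sum(wi * di for wi, di in zip(w, diff))
        margin = max(-50.0, min(50.0, margin))  # guard against exp overflow
        # d(loss)/dw = -sigmoid(-margin) * diff, so step along +diff.
        g = 1.0 / (1.0 + math.exp(margin))
        w = [wi + lr * g * di for wi, di in zip(w, diff)]
    return w


def bagged_rank(pos, background, candidates, n_models=10, seed=0):
    """Average the scores of several bootstrapped rankers, then sort."""
    rng = random.Random(seed)
    totals = [0.0] * len(candidates)
    for _ in range(n_models):
        # Bootstrap: resample both training sets with replacement.
        boot_pos = [rng.choice(pos) for _ in pos]
        boot_neg = [rng.choice(background) for _ in background]
        w = train_pairwise_ranker(boot_pos, boot_neg, rng=rng)
        for i, x in enumerate(candidates):
            totals[i] += sum(wi * xi for wi, xi in zip(w, x))
    scores = [t / n_models for t in totals]
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy data (hypothetical): a few known rare objects near (2, 2),
# a larger background population near (0, 0).
data_rng = random.Random(42)
pos = [[2 + data_rng.gauss(0, 0.2), 2 + data_rng.gauss(0, 0.2)] for _ in range(5)]
background = [[data_rng.gauss(0, 0.2), data_rng.gauss(0, 0.2)] for _ in range(50)]
candidates = [[2.1, 1.9], [0.1, -0.2], [1.8, 2.2], [-0.3, 0.2]]

order, scores = bagged_rank(pos, background, candidates)
print("candidates, best first:", order)
```

In this toy setup the two candidates near the known positives should come out ahead of the two near the background. In a real search the feature vectors would be derived from spectra, and the top of the ranked list would be passed on for inspection rather than every object receiving a hard binary label.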
