QSO Selection Algorithm Using Time Variability and Machine Learning: Selection of 1,620 QSO Candidates from MACHO LMC Database

Dae-Won Kim; Pavlos Protopapas; Yong-Ik Byun; Charles Alcock; Roni Khardon; Markos Trichas

arXiv:1101.3316·astro-ph.IM·May 27, 2015

TL;DR

This paper introduces an SVM-based algorithm that selects QSO candidates from lightcurve time series features. In cross-validation it recovers ~80% of known QSOs at a 25% false positive rate, and it finds 1,620 candidates in the MACHO LMC survey.

Contribution

The study develops an SVM-based QSO selection method trained on time series features (period, amplitude, color, and autocorrelation value) and applies it to the 40 million lightcurves of the MACHO LMC dataset.

Findings

01

Correctly identified ~80% of known QSOs with a 25% false positive rate in cross-validation

02

Found 1,620 QSO candidates from 40 million lightcurves

03

Over 70% of candidates are likely true QSOs
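
The first finding pairs two numbers from a cross-validation test: the fraction of known QSOs recovered and a false positive rate. As a minimal sketch of how such numbers are tallied from true and predicted labels, the Python below uses made-up labels (not MACHO data) and a hypothetical helper, `selection_metrics`. This summary does not state which denominator the 25% figure uses, so the sketch reports both common conventions rather than assuming one.

```python
def selection_metrics(y_true, y_pred):
    """Tally recall and two false-positive conventions for binary labels.

    Labels: 1 = QSO, 0 = non-QSO. Hypothetical helper, for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # QSOs selected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # QSOs missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # contaminants selected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-QSOs rejected
    return {
        # Fraction of true QSOs that the classifier recovers.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Convention A: share of the selected sample that is not a QSO.
        "fp_among_selected": fp / (tp + fp) if tp + fp else 0.0,
        # Convention B: share of all non-QSOs that were wrongly selected.
        "fp_among_negatives": fp / (fp + tn) if fp + tn else 0.0,
    }


# Toy labels (assumed): 5 QSOs followed by 10 non-QSOs.
y_true = [1] * 5 + [0] * 10
y_pred = [1, 1, 1, 1, 0] + [1] + [0] * 9
metrics = selection_metrics(y_true, y_pred)
```

The two conventions can differ widely when non-QSOs vastly outnumber QSOs, as they do in a survey of 40 million lightcurves, so it is worth checking which one a quoted rate refers to.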

Abstract

We present a new QSO selection algorithm using a Support Vector Machine (SVM), a supervised classification method, on a set of extracted time series features including period, amplitude, color, and autocorrelation value. We train a model that separates QSOs from variable stars, non-variable stars and microlensing events, using a training set of 58 known QSOs, 1,629 variable stars and 4,288 non-variables drawn from the MAssive Compact Halo Object (MACHO) database. To estimate the efficiency and the accuracy of the model, we perform a cross-validation test using the training set. The test shows that the model correctly identifies ~80% of known QSOs with a 25% false positive rate. The majority of the false positives are Be stars. We applied the trained model to the MACHO Large Magellanic Cloud (LMC) dataset, which consists of 40 million lightcurves, and found 1,620 QSO candidates. During the…
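
The abstract lists amplitude and an autocorrelation value among the time series features. The following Python is a rough sketch of how two such features could be computed for a single lightcurve. It is not the paper's definition of either feature: it assumes evenly sampled magnitudes (survey lightcurves are generally irregularly sampled), takes amplitude as half the peak-to-peak range, and uses a plain lagged autocorrelation. The function names and the synthetic sinusoid are assumptions for illustration.

```python
import math


def amplitude(mags):
    """Half the peak-to-peak range of the magnitudes (assumed definition)."""
    return (max(mags) - min(mags)) / 2.0


def autocorrelation(mags, lag):
    """Sample autocorrelation at an integer lag, assuming even sampling."""
    n = len(mags)
    mean = sum(mags) / n
    dev = [m - mean for m in mags]
    denom = sum(d * d for d in dev)
    if denom == 0.0 or lag >= n:
        return 0.0  # constant series or lag too long: no usable signal
    num = sum(dev[i] * dev[i + lag] for i in range(n - lag))
    return num / denom


# Synthetic lightcurve (assumed): sinusoid with period 20 samples, amplitude 0.3 mag.
curve = [0.3 * math.sin(2 * math.pi * i / 20) for i in range(200)]

features = {
    "amplitude": amplitude(curve),
    "autocorr_lag20": autocorrelation(curve, 20),  # one full period: strongly positive
    "autocorr_lag10": autocorrelation(curve, 10),  # half a period: strongly negative
}
```

A feature vector of this kind, one per lightcurve, is the sort of input a supervised classifier such as an SVM would be trained on.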
