Detecting False Positives With Derived Planetary Parameters: Experimenting with the KEPLER Dataset
Ayan Bin Rafaih, Zachary Murray

TL;DR
This study explores using derived planetary parameters instead of full light curves to improve machine learning classification of exoplanets, achieving high false positive detection rates with various models on the KEPLER dataset.
Contribution
It introduces a novel approach of using derived planetary parameters for classification, demonstrating comparable or superior performance to traditional light curve analysis.
Findings
Random Forest and CNN achieved the highest accuracy.
The approach identifies up to about 90% of false positives.
Derived parameters contain most of the relevant information in a light curve.
Abstract
Recent developments in computational power and machine learning techniques motivate their use in many different astrophysical research areas. Consequently, many machine learning models have been trained to classify exoplanet transit signals, typically using time-series light curves. In this work, we take a different approach and try to improve the efficiency of these algorithms by fitting only derived planetary parameters instead of full time-series light curves. We investigate and evaluate four models (Logistic Regression, Random Forest, Support Vector Machines, and Convolutional Neural Networks) on the KEPLER dataset, using the precision-recall trade-off and accuracy as metrics. We show that this approach can identify up to about 90% of false positives, implying that the planetary parameters encompass most of the relevant information contained in a light curve. Random Forest and…
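The abstract evaluates classifiers by accuracy and the precision-recall trade-off. The sketch below illustrates how those quantities relate when "false positive" is treated as the positive class, so that recall corresponds to the fraction of false positives identified. That reading is an assumption, and the labels, scores, and function names are made up for illustration; they are not taken from the paper or from the KEPLER dataset.

```python
from typing import List, Tuple


def confusion_counts(y_true: List[int], scores: List[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn), with label 1 meaning 'false positive signal'."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(y_true: List[int], scores: List[float],
            threshold: float) -> Tuple[float, float, float]:
    """Return (precision, recall, accuracy) at the given score threshold."""
    tp, fp, fn, tn = confusion_counts(y_true, scores, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


# Hypothetical data: 1 = false positive signal, 0 = planet candidate.
# Scores stand in for any model's predicted probability of class 1.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.55, 0.45]

# Sweep the decision threshold to see the precision-recall trade-off.
for threshold in (0.3, 0.5, 0.7):
    p, r, a = metrics(y_true, scores, threshold)
    print(f"threshold={threshold:.1f} precision={p:.3f} recall={r:.3f} accuracy={a:.3f}")
```

In this toy example, lowering the threshold flags more signals as false positives, which raises recall at the cost of precision. Accuracy alone does not show that trade-off, which may be why both kinds of metric are reported.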
