Exoplanet Detection Using Machine Learning Models Trained on Synthetic Light Curves

Ethan Lo; Dan C. Lo

arXiv:2507.19520·cs.LG·July 29, 2025

TL;DR

This paper explores the use of machine learning models trained on synthetic light curves to improve exoplanet detection efficiency, demonstrating promising results and emphasizing the importance of data augmentation for better accuracy.

Contribution

It evaluates simple ML models like logistic regression, k-nearest neighbors, and random forest for exoplanet detection using NASA data, highlighting the impact of data augmentation.
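
As a rough illustration of what a synthetic light curve and a simple augmentation step might look like, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the box-shaped transit, the function names `synthetic_light_curve` and `augment`, all parameter values, and the choice of time reversal plus noise jitter as augmentations are assumptions made for illustration. The summary above does not state which augmentation method the paper uses.

```python
import random


def synthetic_light_curve(n_points=200, period=50, duration=5, depth=0.01,
                          noise=0.001, has_planet=True, seed=None):
    """Return normalized flux values, with optional box-shaped transit dips.

    All parameters are hypothetical defaults, not values from the paper.
    """
    rng = random.Random(seed)
    flux = []
    for t in range(n_points):
        value = 1.0  # baseline stellar brightness
        if has_planet and (t % period) < duration:
            value -= depth  # planet blocks a fraction of the light
        flux.append(value + rng.gauss(0.0, noise))  # Gaussian measurement noise
    return flux


def augment(flux, noise=0.0005, seed=None):
    """Return two augmented copies: a time-reversed and a noise-jittered curve.

    These two transforms are assumed for illustration only.
    """
    rng = random.Random(seed)
    flipped = flux[::-1]
    jittered = [f + rng.gauss(0.0, noise) for f in flux]
    return [flipped, jittered]


curve = synthetic_light_curve(seed=0)
extra = augment(curve, seed=1)
```

Each augmented curve keeps the label of the curve it came from, so a small set of positive (planet) examples can be expanded before training a classifier.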

Findings

01

Data augmentation improves recall and precision.

02

ML models show promising initial results.

03

Accuracy varies across models.
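
For readers unfamiliar with the metrics named in the findings, the sketch below shows how precision and recall are computed from binary predictions. The function name `precision_recall` and the example labels are hypothetical and are not taken from the paper's results.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = planet, 0 = no planet)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Recall matters in particular when planets are rare in the data: a model that predicts "no planet" for every curve can score high accuracy while finding nothing.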

Abstract

Manual searches for exoplanets are slow because they require extensive, laborious inspection. As of now, only about 5,000 exoplanets have been confirmed since the late 1900s. Recently, machine learning (ML) has proven extremely valuable and efficient in various fields, capable of processing massive amounts of data and of improving its accuracy by learning. Though ML models for discovering exoplanets already exist at large organizations (e.g. NASA), they largely depend on complex algorithms and supercomputers. To reduce such complexity, in this paper we report the results and potential benefits of various well-known ML models in the discovery and validation of extrasolar planets. The ML models that are examined in this study include logistic…
