Supervised Machine Learning Methods with Uncertainty Quantification for Exoplanet Atmospheric Retrievals from Transmission Spectroscopy
Roy T. Forestano, Konstantin T. Matchev, Katia Matcheva, Eyup B. Unlu

TL;DR
This paper systematically compares various machine learning regression techniques for exoplanet atmospheric retrievals from transmission spectra, focusing on accuracy, speed, and uncertainty quantification, to provide efficient alternatives to Bayesian methods.
Contribution
It evaluates multiple ML algorithms and preprocessing methods, identifying the best combination for exoplanet atmospheric parameter retrievals from transmission spectra.
Findings
Support vector machines and random forests perform best in accuracy and speed.
Preprocessing significantly impacts model performance and uncertainty estimates.
The optimal ML model was validated on JWST observations of WASP-39b.
Abstract
Standard Bayesian retrievals for exoplanet atmospheric parameters from transmission spectroscopy, while well understood and widely used, are generally computationally expensive. In the era of the JWST and other upcoming observatories, machine learning approaches have emerged as viable alternatives that are both efficient and robust. In this paper we present a systematic study of several existing machine learning regression techniques and compare their performance for retrieving exoplanet atmospheric parameters from transmission spectra. We benchmark the different algorithms on accuracy, precision, and speed. The regression methods tested here include partial least squares (PLS), support vector machines (SVM), k-nearest neighbors (KNN), decision trees (DT), random forests (RF), voting (VOTE), stacking (STACK), and extreme gradient boosting (XGB). We also…
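The kind of benchmark the abstract describes (fit several regressors on spectra, then compare accuracy and wall-clock time) can be sketched in a few lines. The following is a minimal illustration only, not the authors' pipeline: the toy data generator, the `MeanBaseline` and `KNNRegressor` classes, and the `benchmark` helper are all hypothetical names introduced here, the "spectra" are a synthetic bump scaled by a single parameter rather than a physical forward model, and only a KNN regressor and a trivial baseline stand in for the full list of methods.

```python
import math
import random
import time


def make_toy_spectra(n_samples, n_bins=20, noise=0.01, seed=0):
    """Generate toy (spectrum, parameter) pairs. NOT a physical forward model."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        theta = rng.uniform(0.0, 1.0)  # hypothetical atmospheric parameter
        spectrum = [
            theta * math.exp(-((i / n_bins - 0.5) ** 2) / 0.05) + rng.gauss(0.0, noise)
            for i in range(n_bins)
        ]
        X.append(spectrum)
        y.append(theta)
    return X, y


class MeanBaseline:
    """Always predicts the training mean; a floor for comparison."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class KNNRegressor:
    """Brute-force k-nearest-neighbors regression with Euclidean distance."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X_, self.y_ = X, y
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Squared distance to every training spectrum, paired with its target.
            dists = sorted(
                (sum((a - b) ** 2 for a, b in zip(x, xt)), yt)
                for xt, yt in zip(self.X_, self.y_)
            )
            preds.append(sum(yt for _, yt in dists[: self.k]) / self.k)
        return preds


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def benchmark(models, X_train, y_train, X_test, y_test):
    """Fit and predict with each model, recording R^2 and elapsed seconds."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        preds = model.predict(X_test)
        elapsed = time.perf_counter() - start
        results[name] = {"r2": r2_score(y_test, preds), "seconds": elapsed}
    return results


X, y = make_toy_spectra(300)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]
results = benchmark(
    {"mean": MeanBaseline(), "knn": KNNRegressor(k=5)},
    X_train, y_train, X_test, y_test,
)
for name, r in results.items():
    print(f"{name}: R^2 = {r['r2']:.3f}, time = {r['seconds']:.4f} s")
```

In practice one would replace the toy classes with library implementations of the methods listed in the abstract and the synthetic generator with simulated transmission spectra; the harness structure (same train/test split, same metric, timed fit and predict) stays the same.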
