Monte Carlo Techniques for Addressing Large Errors and Missing Data in Simulation-based Inference
Bingjie Wang, Joel Leja, Ashley Villar, Joshua S. Speagle

TL;DR
This paper introduces a Monte Carlo method for handling missing data and out-of-distribution measurement errors in simulation-based inference, extending its applicability to large, heterogeneous astronomical datasets while remaining much faster than traditional fitting.
Contribution
The authors develop a Monte Carlo technique that allows SBI to effectively manage heterogeneous data with missing observations and variable uncertainties, expanding its use in astronomy.
Findings
Out-of-distribution measurement errors can be approximated using standard SBI evaluations.
Missing data can be marginalized over using SBI evaluations at nearby data realizations in the training set.
Inference slows from about 1 second to about 1.5 minutes per object, which is still far faster than the hours per fit that traditional tools take.
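The missing-data finding above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the one-parameter Gaussian simulator, the closed-form `sample_posterior` (a stand-in for a trained neural posterior estimator), the helper `marginalize_missing`, and the settings `k`, `n_mc`, and `n_per_draw` are all hypothetical choices made for illustration. The idea shown is to fill each missing band with values borrowed from training simulations that lie close to the observation in the observed bands, evaluate the posterior for every filled-in vector, and pool the samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy simulator (hypothetical): one parameter theta ~ N(0, 1), observed in
# N_BANDS bands as x_i = theta + Gaussian noise of fixed width SIGMA_TRAIN.
N_BANDS, SIGMA_TRAIN = 4, 0.5
theta_train = rng.normal(0.0, 1.0, size=5000)
x_train = theta_train[:, None] + rng.normal(0.0, SIGMA_TRAIN, size=(5000, N_BANDS))


def sample_posterior(x, n, rng):
    """Stand-in for a trained posterior estimator (assumption).

    For this toy model the posterior is Gaussian in closed form, so no
    network is needed. Like a trained SBI model, it only accepts complete
    data vectors with the training noise level.
    """
    var = 1.0 / (1.0 + N_BANDS / SIGMA_TRAIN**2)
    mean = var * np.sum(x) / SIGMA_TRAIN**2
    return rng.normal(mean, np.sqrt(var), size=n)


def marginalize_missing(x_obs, n_mc=200, k=50, n_per_draw=20, rng=rng):
    """Pool posterior samples over Monte Carlo fills of the missing bands."""
    missing = np.isnan(x_obs)
    if not missing.any():
        return sample_posterior(x_obs, n_mc * n_per_draw, rng)
    # Distance to every training simulation, using the observed bands only.
    dist = np.linalg.norm(x_train[:, ~missing] - x_obs[~missing], axis=1)
    neighbours = np.argsort(dist)[:k]
    samples = []
    for _ in range(n_mc):
        # Borrow the missing-band values from a randomly chosen neighbour.
        donor = rng.choice(neighbours)
        x_fill = x_obs.copy()
        x_fill[missing] = x_train[donor, missing]
        samples.append(sample_posterior(x_fill, n_per_draw, rng))
    return np.concatenate(samples)


x_obs = np.array([1.1, np.nan, 0.9, 1.0])  # second band is missing
post = marginalize_missing(x_obs)
print(post.mean(), post.std())
```

Each Monte Carlo draw costs one extra posterior evaluation, which is consistent with the reported slowdown relative to a single SBI evaluation. A real application would replace `sample_posterior` with the trained estimator's sampling call and may prefer a smoothed density over the neighbours rather than drawing them directly.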
Abstract
Upcoming astronomical surveys will observe billions of galaxies across cosmic time, providing a unique opportunity to map the many pathways of galaxy assembly to an incredibly high resolution. However, the huge amount of data also poses an immediate computational challenge: current tools for inferring parameters from the light of galaxies take hours per fit. This is prohibitively expensive. Simulation-based Inference (SBI) is a promising solution. However, it requires simulated data with identical characteristics to the observed data, whereas real astronomical surveys are often highly heterogeneous, with missing observations and variable uncertainties determined by sky and telescope conditions. Here we present a Monte Carlo technique for treating out-of-distribution measurement errors and missing data using standard SBI tools. We show that out-of-distribution measurement…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Statistical Methods and Inference · Statistical and numerical algorithms
