Mitigating Simulator Dependence in AI Parameter Inference for the Epoch of Reionization: The Importance of Simulation Diversity

Jasper Solt; Jonathan C. Pober; Stephen H. Bach

arXiv:2601.05229·astro-ph.CO·May 8, 2026

TL;DR

This paper demonstrates that training AI models on diverse EoR simulations enhances their ability to generalize across different simulators, reducing bias and improving parameter inference accuracy.

Contribution

The authors introduce a training strategy that leverages multiple simulation datasets to improve AI model robustness in EoR parameter inference.

Findings

1. Models trained on multiple simulators outperform single-simulator models on unseen data.

2. Increasing simulation diversity in training reduces bias from simulator-specific artifacts.

3. Multi-simulator training enhances the generalization capability of AI models for cosmological inference.

Abstract

The 21cm signal of neutral hydrogen contains a wealth of information about the poorly constrained era of cosmological history, the Epoch of Reionization (EoR). Recently, AI models trained on EoR simulations have gained significant attention as a powerful and flexible option for inferring parameters from 21cm observations. However, previous works show that AI models trained on data from one simulator fail to generalize to data from another, raising doubts about AI models' ability to accurately infer parameters from observation. We develop a new strategy for training AI models on cosmological simulations based on the principle that increasing the diversity of the training dataset improves model robustness by averaging out spurious and contradictory information. We train AI models on data from different combinations of four simulators, then compare the models' performance when predicting…
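The experimental design described above, training on different combinations of four simulators and testing on data the model has not seen, can be sketched as an enumeration of train/held-out splits. This is only an illustrative sketch: the simulator names below are hypothetical placeholders (the abstract does not name them), and pairing each training combination with the remaining simulators as the evaluation set is an assumption, not a detail confirmed by the text.

```python
from itertools import combinations

# Hypothetical placeholder names; the abstract does not name the four simulators.
SIMULATORS = ["sim_a", "sim_b", "sim_c", "sim_d"]


def training_splits(simulators):
    """Yield (train, held_out) pairs for every non-empty proper subset.

    Assumption: each model is evaluated on the simulators it was not
    trained on. The source does not spell out the evaluation protocol.
    """
    for k in range(1, len(simulators)):
        for train in combinations(simulators, k):
            held_out = tuple(s for s in simulators if s not in train)
            yield train, held_out


splits = list(training_splits(SIMULATORS))
for train, held_out in splits:
    print(f"train on {train}, evaluate on unseen {held_out}")
```

With four simulators this produces 14 splits (4 single-simulator, 6 two-simulator, and 4 three-simulator training sets), which gives a sense of how many models such a comparison could involve under these assumptions.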
