Reproducibility of machine learning analyses of 21 cm reionization maps
Kimeel Sooknunan, Emma Chapman, Luke Conaboy, Daniel Mortlock, and Jonathan Pritchard

TL;DR
This paper investigates the reproducibility and generalization of machine learning models, specifically CNNs, used for analyzing 21 cm reionization maps, revealing limitations in their ability to generalize across different simulations.
Contribution
The study reproduces existing CNN models for 21 cm map analysis and demonstrates their tendency to learn simulation-specific features, highlighting challenges in applying ML to real observational data.
Findings
CNNs often learn features of individual simulations rather than physics
Networks fail to generalize well to unseen simulations
Case studies identify factors that improve or degrade network performance
Abstract
Machine learning (ML) methods have become popular for parameter inference in cosmology, although their reliance on specific training data can cause difficulties when applied across different data sets. By reproducing and testing networks previously used in the field, and applied to 21cmFast and Simfast21 simulations, we show that convolutional neural networks (CNNs) often learn to identify features of individual simulation boxes rather than the underlying physics, limiting their applicability to real observations. We examine the prediction of the neutral fraction and astrophysical parameters from 21 cm maps and find that networks typically fail to generalise to unseen simulations. We explore a number of case studies to highlight factors that improve or degrade network performance. These results emphasise the responsibility on users to ensure ML models are applied correctly in 21 cm…
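The cross-simulation test described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `toy_simulator` is a hypothetical stand-in for a 21 cm simulation code (not 21cmFast or Simfast21), its `gain` and `offset` arguments mimic code-specific systematics, and a linear regression on the map mean replaces the CNNs studied in the paper. The sketch only shows the shape of the evaluation: fit on one simulator, then compare the error on held-out maps from the same simulator with the error on maps from a different one.

```python
import numpy as np


def toy_simulator(x_hi, rng, gain, offset, size=16):
    """Hypothetical simulator: one noisy 2D map per neutral fraction value.

    `gain` and `offset` are made-up knobs that differ between "codes".
    """
    noise = rng.normal(0.0, 0.05, size=(len(x_hi), size, size))
    return gain * x_hi[:, None, None] + offset + noise


def features(maps):
    # Design matrix: the mean of each map plus a constant bias column.
    means = maps.reshape(len(maps), -1).mean(axis=1)
    return np.column_stack([means, np.ones(len(maps))])


def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, 200)  # neutral fractions used for training
x_test = rng.uniform(0.0, 1.0, 200)   # neutral fractions used for testing

# "Simulator A" supplies the training set and an in-distribution test set.
maps_a_train = toy_simulator(x_train, rng, gain=1.0, offset=0.0)
maps_a_test = toy_simulator(x_test, rng, gain=1.0, offset=0.0)
# "Simulator B" encodes the same quantity with different systematics.
maps_b_test = toy_simulator(x_test, rng, gain=0.8, offset=0.1)

# Least-squares fit on simulator A only.
weights, *_ = np.linalg.lstsq(features(maps_a_train), x_train, rcond=None)

rmse_same = rmse(features(maps_a_test) @ weights, x_test)
rmse_cross = rmse(features(maps_b_test) @ weights, x_test)

print(f"RMSE on held-out maps from simulator A: {rmse_same:.4f}")
print(f"RMSE on maps from unseen simulator B:   {rmse_cross:.4f}")
```

In this toy setting the model fits the mapping specific to simulator A, so the second error is larger than the first. Reporting both numbers side by side is one simple way to check whether a model has learned simulator-specific features.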
Taxonomy
Topics: Cell Image Analysis Techniques · AI in cancer detection · Advanced Image and Video Retrieval Techniques
