Human vs. machine -- 1:3. Joint analysis of classical and ML-based summary statistics of the Lyman-$\alpha$ forest

S. Chang; P. Nayak; M. Walther; D. Gruen

arXiv:2508.03264·astro-ph.CO·August 6, 2025

TL;DR

This paper compares traditional and machine learning-based summary statistics for Lyman-alpha forest data, showing that the ML summaries capture nearly all of the information in the traditional statistics and significantly tighten parameter constraints.

Contribution

It demonstrates that ML-based summaries contain most traditional information and offer substantially tighter constraints on intergalactic medium parameters.

Findings

01. ML summaries capture nearly all of the information in the traditional statistics

02. ML summaries improve parameter constraints by over a factor of 3

03. Combining summaries significantly enhances the figure of merit

Abstract

In order to compress and more easily interpret Lyman-$\alpha$ forest (Ly$\alpha$F) datasets, summary statistics, e.g. the power spectrum, are commonly used. However, such summaries unavoidably lose some information, weakening the constraining power on parameters of interest. Recently, machine learning (ML)-based summary approaches have been proposed as an alternative to human-defined statistical measures. This raises a question: can ML-based summaries contain the full information captured by traditional statistics, and vice versa? In this study, we apply three human-defined techniques and one ML-based approach to summarize mock Ly$\alpha$F data from hydrodynamical simulations and infer two thermal parameters of the intergalactic medium, assuming a power-law temperature-density relation. We introduce a metric for measuring the improvement in the figure of merit when combining two…
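
As a point of orientation, the power-law temperature-density relation mentioned in the abstract is commonly written as $T = T_0 \Delta^{\gamma - 1}$, where $\Delta$ is the gas overdensity, $T_0$ the temperature at mean density, and $\gamma$ the slope. The abstract does not name its symbols, so this form, the function name `igm_temperature`, and the numerical values below are assumptions chosen for illustration, not the paper's setup. A minimal sketch:

```python
def igm_temperature(delta, t0, gamma):
    """Temperature (K) of gas at overdensity `delta`.

    Assumes the commonly used power law T = T0 * Delta**(gamma - 1);
    the paper's exact parametrization is not given in the abstract.
    """
    return t0 * delta ** (gamma - 1.0)


# Illustrative values only (not taken from the paper).
overdensities = [0.5, 1.0, 2.0]
temperatures = [igm_temperature(d, t0=1.0e4, gamma=1.5) for d in overdensities]
```

Under this form, gas at mean density ($\Delta = 1$) sits at $T_0$ regardless of $\gamma$, and $\gamma = 1$ corresponds to an isothermal medium, which is why the relation is fully described by two thermal parameters.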
