Modeling the galaxy-halo connection with machine learning
Ana Maria Delgado, Digvijay Wadekar, Boryana Hadzhiyska, Sownak Bose, Lars Hernquist, Shirley Ho

TL;DR
This paper uses machine learning to model the galaxy-halo connection more accurately than standard mass-only halo occupation distribution (HOD) models, improving galaxy clustering predictions for cosmology.
Contribution
It introduces a machine learning framework, built on a random forest regressor and symbolic regression, that captures the high-dimensional dependence of galaxy occupation on halo properties and surpasses standard HOD models.
Findings
Machine learning models improve galaxy clustering predictions relative to the standard HOD.
Secondary halo parameters, beyond halo mass, significantly influence galaxy occupation.
HOD models augmented with these dependencies better match the simulation data.
Abstract
To extract information from the clustering of galaxies on non-linear scales, we need to model the connection between galaxies and halos accurately and flexibly. Standard halo occupation distribution (HOD) models assume that the galaxy occupation in a halo is a function of only its mass. In reality, however, the occupation can depend on various other parameters, including halo concentration, assembly history, environment, and spin. Using the IllustrisTNG hydrodynamic simulation as our target, we show that machine learning tools can be used to capture this high-dimensional dependence and provide more accurate galaxy occupation models. Specifically, we use a random forest regressor to identify which secondary halo parameters best model the galaxy-halo connection and symbolic regression to augment the standard HOD model with simple equations capturing the dependence…
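To make the mass-only assumption concrete, here is a minimal sketch of a standard HOD in a commonly used parameterization: an error-function step for centrals and a power law for satellites. The abstract does not give the paper's equations, so every name and parameter value below is an illustrative assumption, not the authors' model or a fit to IllustrisTNG. The optional `secondary` and `a_sec` arguments are a hypothetical example of how a secondary halo property could modify the occupation. They do not reproduce the equations found by symbolic regression.

```python
import math

# Illustrative parameter values only (assumptions, not fitted to IllustrisTNG).
LOG_M_MIN = 12.0   # log10 halo mass at which half of halos host a central
SIGMA_LOG_M = 0.3  # width of the central-occupation transition
LOG_M0 = 12.0      # log10 cutoff mass below which halos host no satellites
LOG_M1 = 13.3      # log10 mass scale at which halos host about one satellite
ALPHA = 1.0        # power-law slope of the satellite occupation


def mean_centrals(log_mass, secondary=0.0, a_sec=0.0):
    """Mean number of central galaxies in a halo of the given log10 mass.

    With a_sec = 0 this is the mass-only model. A nonzero a_sec shifts the
    threshold mass linearly with a secondary property (hypothetical form).
    """
    shifted_log_m_min = LOG_M_MIN + a_sec * secondary
    return 0.5 * (1.0 + math.erf((log_mass - shifted_log_m_min) / SIGMA_LOG_M))


def mean_satellites(log_mass):
    """Mean number of satellite galaxies: a power law above the cutoff mass."""
    mass, m0, m1 = 10.0 ** log_mass, 10.0 ** LOG_M0, 10.0 ** LOG_M1
    if mass <= m0:
        return 0.0
    return ((mass - m0) / m1) ** ALPHA


if __name__ == "__main__":
    for log_mass in (11.5, 12.0, 13.0, 14.0):
        print(log_mass, mean_centrals(log_mass), mean_satellites(log_mass))
```

In the mass-only case, `mean_centrals(12.0)` returns 0.5 because the halo mass equals the threshold. Passing a negative `a_sec` with a positive `secondary` value lowers the threshold and raises the occupation at fixed mass, which is the kind of dependence a mass-only model cannot express.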
Taxonomy
Topics: Galaxies: Formation, Evolution, Phenomena · Impact of Light on Environment and Health
