TL;DR
This paper uses an ensemble of neural networks with probabilistic outputs to model the stochastic galaxy-halo connection in cosmological simulations, revealing key features and sources of scatter.
Contribution
The paper introduces an ensemble of neural networks trained with a Gaussian loss, so that each prediction is a probability distribution rather than a point estimate. This lets the model capture the intrinsic scatter in the galaxy-halo connection.
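As a rough illustration of the idea, the sketch below shows one common way a "Gaussian loss" and an ensemble can be put together. It assumes the loss is the Gaussian negative log-likelihood of a predicted mean and standard deviation, and that ensemble members are combined as a uniform mixture whose mean and variance are moment-matched. The summary does not confirm either choice, and the function names `gaussian_nll` and `combine_ensemble` are hypothetical.

```python
import math


def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of a target y under N(mu, sigma^2).

    Assumption: this is what "Gaussian loss" means here; a network would
    output (mu, sigma) per galaxy and be trained to minimise this value.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)


def combine_ensemble(mus, sigmas):
    """Moment-match a uniform mixture of Gaussians to a single mean and std.

    Assumption: members are weighted equally. The total variance is the
    mean of the member variances plus the variance of the member means.
    """
    n = len(mus)
    mean = sum(mus) / n
    second_moment = sum(s ** 2 + m ** 2 for m, s in zip(mus, sigmas)) / n
    variance = max(second_moment - mean ** 2, 0.0)  # guard against round-off
    return mean, math.sqrt(variance)


# Two hypothetical ensemble members predicting the same target quantity.
mean, std = combine_ensemble([1.0, 3.0], [1.0, 1.0])
loss = gaussian_nll(2.0, mean, std)
```

In this reading, the predicted `sigma` plays the role of the scatter at fixed inputs, while disagreement between members widens the combined distribution further.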
Findings
Halo properties beyond mass explain up to 50% of the scatter in stellar mass.
Adding more halo features does not significantly improve predictions of galaxy size or gas mass.
Semi-analytic models assuming size-spin relations are not supported by the data.
Abstract
We apply machine learning, a powerful method for uncovering complex correlations in high-dimensional data, to the galaxy-halo connection of cosmological hydrodynamical simulations. The mapping between galaxy and halo variables is stochastic in the absence of perfect information, but conventional machine learning models are deterministic and hence cannot capture its intrinsic scatter. To overcome this limitation, we design an ensemble of neural networks with a Gaussian loss function that predict probability distributions, allowing us to model statistical uncertainties in the galaxy-halo connection as well as its best-fit trends. We extract a number of galaxy and halo variables from the Horizon-AGN and IllustrisTNG100-1 simulations and quantify the extent to which knowledge of some subset of one enables prediction of the other. This allows us to identify the key features of the…
