Climbing Halo Merger Trees with TreeFrog

Pascal J. Elahi; Rhys J.J. Poulton; Rodrigo J. Tobar; Rodrigo Canas,; Claudia del P. Lagos; Chris Power; Aaron S.G. Robotham

arXiv:1902.01527·astro-ph.IM·August 15, 2019

Climbing Halo Merger Trees with TreeFrog

Pascal J. Elahi, Rhys J.J. Poulton, Rodrigo J. Tobar, Rodrigo Canas,, Claudia del P. Lagos, Chris Power, Aaron S.G. Robotham

PDF

TL;DR

TreeFrog is a high-performance, parallel halo merger tree builder that improves the accuracy of halo tracking across snapshots, reduces pathologies, and aligns well with established merger rate models.

Contribution

The paper introduces TreeFrog, a novel parallel code for constructing halo merger trees with improved matching algorithms and pathology corrections, enhancing accuracy over previous methods.

Findings

01

Reduces stranded halos by searching multiple snapshots.

02

Improves primary-secondary progenitor distinction with a new merit function.

03

Aligns measured merger rates with established models.

Abstract

We present TreeFrog, a massively parallel halo merger tree builder that is capable comparing different halo catalogues and producing halo merger trees. The code is written in c++11, use the MPI and OpenMP API's for parallelisation, and includes python tools to read/manipulate the data products produced. The code correlates binding energy sorted particle ID lists between halo catalogues, determining optimal descendant/progenitor matches using multiple snapshots, a merit function that maximises the number of shared particles using pseudo-radial moments, and a scheme for correcting halo merger tree pathologies. Focusing on VELOCIraptor catalogues for this work, we demonstrate how searching multiple snapshots spanning a dynamical time significantly reduces the number of stranded halos, those lacking a descendant or a progenitor, critically correcting poorly resolved halos. We present a new…

Figures28

Click any figure to enlarge with its caption.

Tables3

Table 1. Table 1 : Key TreeFrog parameters

	Name	Default Value	Comments
General Tree Options			Related to general tree construction.
	Tree_direction	1	Integer indicating direction in which to process snapshots and build the tree. Descendant [1], Progenitor [0], or Both [-1].
	Particle_type_to_use	-1	Particle types to use when calculating merits. All [-1], Gas [0], Dark Matter [1], Star [4].
	Default_values	1	Whether to use default cross matching & merit options when building the tree. 1/0 for True/False.
Merit Options			Related to calculation of merit function.
	Merit_type	6	Integer specifying merit function to use. Optimal descendant tree merit in Eq. (3) [6], common (progenitor tree) merit in Eq. (1) [1].
	Core_match_type	2	Integer flag indicating the type of core matching used. Off [0], core-to-all [1], core-to-all followed by core-to-core [2], core-to-core only [3].
	Particle_core_fraction	0.4	Fraction of particles to use when calculating merits. Assumes some meaningful rank ordering to input particle lists and uses the first $f_{TF}$ fraction.
	Particle_core_min_numpart	5	Minimum number of particles to use when calculating merit if core fraction matching enabled.
Temporal Linking Options			Related to how code searches for candidate links across multiple snapshots.
	Nsteps_search_new_links	1	Number of snapshots to search for links.
	Multistep_linking_criterion	3	Integer specifying the criteria used when deciding whether more snapshots should be searched for candidate links. Criteria depend on tree direction. Descendant Tree: continue searching if halo is: missing descendant [0]; missing descendant or descendant merit is low [1]; missing descendant or missing primary descendant [2]; missing a descendant, a primary descendant or primary descendant has poor merit [3]. Progenitor tree: [0,1].
	Merit_limit_continuing_search	0.025	Float specifying the merit limit a match must meed if using Multistep_linking_criterion=[1,3].

Table 2. Table 2 : Simulation parameters

Name	Box size	Number of	Particle Mass	Softening
		Particles		Length
	$L_{box}$ [ $h^{- 1} Mpc$ ]	$N_{p}$	$m_{p}$ [ $h^{- 1} M_{⊙}$ ]	$ϵ$ [ $h^{- 1} kpc$ ]
L40N512	$40$	$512^{3}$	$4.13 \times 10^{7}$	2.6
L210N1536	$210$	$1536^{3}$	$2.21 \times 10^{8}$	4.5

Table 3. Table 3 : TreeFrog configuration parameters

	Name	Default Value	Comments
Base Tree Construction Options			Related to basic operation.
	Tree_direction	1	Integer indicating direction in which to process snapshots and build the tree. Descendant [1], Progenitor [0], or Both [-1].
	Particle_type_to_use	-1	Particle types to use when calculating merits. All [-1], Gas [0], Dark Matter [1], Star [4].
	Default_values	-1	Whether to use default cross matching & merit options when building the tree. 1/0 for True/False.
Input/Output Options			Related to input/output formats.
	Input_tree_format	2	Integer flag indicating input halo catalogue format. ASCII SUSSING format (see Srisawat et al., 2013) [1], VELOCIraptor catalogues [2], ASCII nIFTy format, ASCII VOID catalogue format (see Sutter et al., 2014).
	VELOCIraptor_input_
	format	2	Integer flag indicating input format of VELOCIraptor catalogue. ASCII [0], Binary [1], HDF5 [2].
	VELOCIraptor_input_
	field_sep_files	0	Flag indicating whether halos and subhalos are written in separate files. All (sub)halos together [0], separate [1].
	VELOCIraptor_input_
	num_files_per_snap	1	If VELOCIraptor run in MPI mode, more than one file produced. Multiple files [1], one files [0].
	Output_format	2	Integer flag for output format. ASCII [0], HDF5 [2].
	Output_data_content	1	Integer flag for data contained in the output. BASIC (only descendant or progenitor IDs) [0], Standard (IDs plus merit) [1], Verbose (IDs, merit, and number of particles in structure).
Merit Options			Related to calculation of merit function.
	Merit_type	6	Integer specifying merit function to use. Optimal descendant tree merit in Eq. (3) [6], common (progenitor tree) merit in Eq. (1) [1].
	Core_match_type	2	Integer flag indicating the type of core matching used. Off [0], core-to-all [1], core-to-all followed by core-to-core [2], core-to-core only [3].
	Particle_core_fraction	0.4	Fraction of particles to use when calculating merits. Assumes some meaningful rank ordering to input particle lists and uses the first $f_{TF}$ fraction.
	Particle_core_min_
	numpart	5	Minimum number of particles to use when calculating merit if core fraction matching enabled.
	Shared_particle_signal_to_
	noise_limit	1	Mininum significance $σ_{N}$ of number of shared particles between object $i$ and $j$ , such that links with $N_{i ⋂ j} < σ_{N} \sqrt{N_{i}}, N_{i ⋂ j} < σ_{N} \sqrt{N_{j}}$ , that is where number of shared particle is below Poisson fluctuations, are removed.
Temporal Linking Options			Related to how code searches for candidate links across multiple snapshots.
	Nsteps_search_new_links	1	Number of snapshots to search for links.
	Multistep_linking_criterion	3	Integer specifying the criteria used when deciding whether more snapshots should be searched for candidate links. Criteria depend on tree direction. Descendant Tree: continue searching if halo is: missing descendant [0]; missing descendant or descendant merit is low [1]; missing descendant or missing primary descendant [2]; missing a descendant, a primary descendant or primary descendant has poor merit [3]. Progenitor tree: [0,1].
	Merit_limit_continuing_
	search	0.025	Float specifying the merit limit a match must meed if using Multistep_linking_criterion=[1,3].
	Temporal_merit_type	1	Integer specifying how merits at different times are weighted. Descendant Tree: Adjusts weights according to ranking and ignore temporal information for descendant trees [1], Adjust weights using ranking and temporal information [0]. Progenitor Tree: Adjust weights using temporal information [0].
	Merit_ratio_limit	4.0	For objects with secondary descendants but no primary descendant where secondary descendant’s primary progenitor also possibly primary progenitor of another object, maximum merit ratio between the secondary descendant’s primary progenitor and current object for which ranking is altered, leaving secondary now primary and previous primary now primary descendant of different object.
Additional Options
	Max_ID_Value	134217728	TreeFrog assumes particle IDs range from [0,MaxID] and uses this information for internal indexing. Set this value or invoke some form of mapping that maps input IDs to this form. Code will allocate memory of size MaxID to quickly access particle group ids.
	Mapping	0	Integer specifying the type of mapping to use on input particle IDs. No mapping [0], generate a id to index map (computationally intensive) [-1], or [1] a user defined mapping. If number of particles is large, suggestion is to invoke [-1]. This needs to only be done once and the code will save the map.
	Temporal_haloidval	1000000000000	For temporally unique halo IDs,
	HaloID_snapshot_offset	0	Offset applied to all temporal halo id values. Halo IDS have added to them (input snapshot number+HaloID_snapshot_offset)*Temporal_haloidval.
	HaloID_offset	0	Offset applied to all halo id values. Halo IDS are then input index+1+HaloID_offset.

Equations25

N_{A_{i} B_{j}} = N_{A_{i} ⋂ B_{j}}^{2} / (N_{A_{i}} N_{B_{j}}),

N_{A_{i} B_{j}} = N_{A_{i} ⋂ B_{j}}^{2} / (N_{A_{i}} N_{B_{j}}),

S_{A_{i} B_{j}, A_{i}} = l \sum N_{A_{i} ⋂ B_{j}} 1/ R_{l, A_{i}}

S_{A_{i} B_{j}, A_{i}} = l \sum N_{A_{i} ⋂ B_{j}} 1/ R_{l, A_{i}}

M_{A_{i} B_{j}} = N_{A_{i} B_{j}} \tilde{S}_{A_{i} B_{j}, A_{i}} \tilde{S}_{A_{i} B_{j}, B_{i}},

M_{A_{i} B_{j}} = N_{A_{i} B_{j}} \tilde{S}_{A_{i} B_{j}, A_{i}} \tilde{S}_{A_{i} B_{j}, B_{i}},

(x_{C_{t < t - 1}} - x_{B_{t}}) / R (V_{max})_{B_{t}},

(x_{C_{t < t - 1}} - x_{B_{t}}) / R (V_{max})_{B_{t}},

(v_{C_{t < t - 1}} - v_{B_{t}}) / V_{max, B_{t}},

N_{C_{t < t - 1}}

\frac{d N _{FOF, m}}{d z d ξ}

\frac{d N _{FOF, m}}{d z d ξ}

= A (\frac{M}{1 0 ^{12} M _{⊙}})^{α} (1 + z)^{η} {ξ^{β} exp [(\frac{ξ}{ξ ~})^{γ}]},

B C (p, q)

B C (p, q)

D_{B C} (p, q)

min i = 1 \sum m j = 1 \sum n f_{i, j} d_{i, j},

min i = 1 \sum m j = 1 \sum n f_{i, j} d_{i, j},

f_{i, j} \geq 0, 1 \leq i \leq m, 1 \leq j \leq n,

f_{i, j} \geq 0, 1 \leq i \leq m, 1 \leq j \leq n,

j = 1 \sum n f_{i, j} \leq w_{p i}, 1 \leq i \leq m,

i = 1 \sum m f_{i, j} \leq w_{q j}, 1 \leq j \leq n,

i = 1 \sum m j = 1 \sum n f_{i, j} = min {i = 1 \sum m w_{p i}, j = 1 \sum n w_{q j}} .

D_{E M D} = \frac{\sum _{i = 1}^{m} \sum _{j = 1}^{n} f _{i, j} d _{i, j}}{\sum _{i = 1}^{m} \sum _{j = 1}^{n} f _{i, j}} .

D_{E M D} = \frac{\sum _{i = 1}^{m} \sum _{j = 1}^{n} f _{i, j} d _{i, j}}{\sum _{i = 1}^{m} \sum _{j = 1}^{n} f _{i, j}} .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

\jid

PASA

\jyear2024

Climbing Halo Merger Trees with TreeFrog

Pascal J. Elahi1,2,†

Rhys J.J. Poulton1,2

Rodrigo J. Tobar1

Rodrigo Cañas1,2

Claudia del P. Lagos1,2

Chris Power1,2

Aaron S. G. Robotham1

1International Centre for Radio Astronomy Research, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia

2ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D)

†Email: [email protected]

Abstract

We present TreeFrog, a massively parallel halo merger tree builder that is capable comparing different halo catalogues and producing halo merger trees. The code is written in c++11, use the MPI and OpenMP API’s for parallelisation, and includes python tools to read/manipulate the data products produced. The code correlates binding energy sorted particle ID lists between halo catalogues, determining optimal descendant/progenitor matches using multiple snapshots, a merit function that maximises the number of shared particles using pseudo-radial moments, and a scheme for correcting halo merger tree pathologies. Focusing on VELOCIraptor catalogues for this work, we demonstrate how searching multiple snapshots spanning a dynamical time significantly reduces the number of stranded halos, those lacking a descendant or a progenitor, critically correcting poorly resolved halos. We present a new merit function that improves the distinction between primary and secondary progenitors, reducing tree pathologies. We find FOF accretion rates and merger rates show similar mass ratio dependence. The model merger rates from Poole et al. (2017) agree with the measured net growth of halos through mergers.

doi:

10.1017/pas.2024.xxx

keywords:

methods: numerical – galaxies: evolution – galaxies: halos – dark matter

1 Introduction

Cosmological simulations underpin theoretical predictions of the formation and evolution of both galaxies and dark matter halos. Simulations containing billions of tracers are now common place, both N-body (e.g., Millennium, MultiDark, TIAMAT, SURFS Springel et al., 2005; Boylan-Kolchin et al., 2009; Klypin et al., 2016; Poole et al., 2016; Elahi et al., 2018) and full hydrodynamical simulations (e.g., EAGLE, ILLUSTRIS, Horizon-AGN Schaye et al., 2015; Vogelsberger et al., 2014; Dubois et al., 2014). These simulations are processed by sophisticated (sub)halo finders to identify dark matter (sub)halos (see Knebe et al., 2011a; Onions et al., 2012; Knebe et al., 2013, for a discussion of (sub)halo finding) and synthetic galaxies (e.g. Cañas et al., 2018). The evolution across cosmic time of galaxies and cosmic structure are reconstructed through the use of so-called “tree builders”. Following the mass accretion history of dark matter halos and producing “halo merger trees”, for instance, is pivotal in producing synthetic galaxy surveys with Semi-Analytic Models (SAM) of galaxy formation (e.g., Cole et al., 2000; Knebe et al., 2017; Lagos et al., 2018; Baugh et al., 2018, though SAMs can also use extended Press-Schetcher theory to produce Monte Carlo trees calibrated against simulations, e.g., Parkinson et al., 2008; Benson, 2017). The role of “Tree Builders” is to identify the optimal descendants and progenitors of halos/galaxies found at a given snapshot to later and previous snapshots respectively.

There are a variety of tree builders in use (e.g. Behroozi et al., 2013; Jiang et al., 2014; Poole et al., 2017), most of which perform similarly well, at least for reconstructing the accretion history of field halos (see Srisawat et al., 2013, for an overview of tree building). The problem of identifying optimal descendants/progenitors is compounded by imperfect (sub)halo finding as all (sub)halo finders can momentarily lose subhalos Avila et al. (2014). The cadence of the input halo catalogues can also have a severe impact on the resulting merger history, particularly when coupled with imperfect (sub)halo finding as (sub)halos can flicker in and out of existence Wang et al. (2016). The performance of merger trees can impact the synthetic galaxy population produced by SAMs Lee et al. (2014), though, in practice, only the satellite galaxy population is severely effect by flaws in halo merger trees. Srisawat et al. (2013) identified three features Tree Builders should employ in some fashion: the use of particle IDs to match objects between snapshots; using multiple snapshots to identify matches; and a method to smooth out any large mass fluctuations.

Here we present TreeFrog, a halo merger tree builder that employs the first two most critical features outlined in Srisawat et al. (2013)111Freely available https://github.com/pelahi/TreeFrog.git. Documentation is found at https://treefrog-halo-merger-tree-builder.readthedocs.io/en/latest/. When combined with state-of-the-art (sub)halo finders like VELOCIraptor Elahi et al. (2019), also minimises mass fluctuations of orbiting subhalos. The original, highly simplified TreeFrog algorithm was first briefly presented in Srisawat et al. (2013). Here we present significant updates and a full description of the code.

Our paper is organised as follows: in section §2, we outline the code package, present tests of our algorithm in §3 and conclude in §4 with a summary and discussion.

2 Following the evolution of structure with TreeFrog

TreeFrog at the most basic level is a particle correlator, matching particles present in one catalogue with those in another using particle IDs. It relies on particle IDs being continuous across time (or halo catalogues), so any particle type which has fluid IDs cannot be used to build a tree or cross-catalogue222An example would be the use of gas cells from AMR codes.. The basics of the particle correlator was first introduced in Srisawat et al. (2013). Here we present the software in full and significant updates to the original code used in Srisawat et al. (2013). Readers interested in tests and results can skip to §3.

The code can produce a simple cross catalogue or a full halo merger tree. When building a simple cross-catalogue, one reference catalogue is compared to another and all matches with high significance are returned. Full halo merger trees that try to capture the evolution of cosmic structure across cosmic time require extra care and can be constructed in two different fashion, either by walking backwards in time, a so-called progenitor based tree, or walking forwards in time, a so-called descendant based tree. A flow-chart of the code is presented in Fig. 1 and we describe the various aspects of our code below. For readers interested in input interfaces, output, and general modes of operation we suggest skipping to §2.3.

2.1 Merit Function & Optimal Matches

The first step in producing a cross comparison of halo catalogues or full halo merger trees is the cross matching of particles in (sub)halos. The cross-match between catalogues $A$ & $B$ is produced by identifying for each object in catalogue $A$ , the object in catalogue $B$ that maximises a merit function. Several merit functions are available. For a simple comparison between two halo catalogues, the merit is defined as:

[TABLE]

where $N_{A_{i}\bigcap B_{j}}$ is the number of particles shared between objects $i$ in catalogue $A$ and $j$ in catalogue $B$ , and $N_{A_{i}}$ & $N_{B_{j}}$ are the total number of particles in the respective objects. This merit function maximises the fraction of shared particles in both objects and is quite robust Knebe et al. (2011b).

However, there are instances where several possible candidates are identified. This is particularly problematic in multi-merger events that naturally arise when constructing halo merger trees. During similar mass mergers, loosely bound particles can be readily exchanged between halos.

We follow Poole et al. (2017) to alleviate these issues by using the rank of particles as ordered by their binding energy:

[TABLE]

Here the sum is over all shared particles and $\mathcal{R}_{l,A_{i}}$ is the rank of particle $l$ in halo $A_{i}$ , with the most bound particle in the halo having $\mathcal{R}=1$ . The maximum value, when all particles are shared, is $\mathcal{S}^{\rm max}_{A_{i}B_{j},A_{i}}=\gamma+\ln N_{A_{i}}$ , with $\gamma=0.5772156649$ being the Euler-Mascheroni constant.

This second merit requires input catalogues to be ordered according to binding energy333Technically, input catalogues need to be sorted in a physically meaningful fashion for the desired comparison. For halos, radial sorting is also reasonable. as TreeFrog does not calculate a ranking. VELOCIraptor natively has this ranking in place. Catalogues produced by other halo finders must be similarly sorted, otherwise it is strongly advised that one does not use this additional ranking merit in Eq. (2).

We combine Eq. (1) with the normalised version of Eq. (2), i.e $\tilde{\mathcal{S}}_{A_{i}B_{j},A_{i}}=\mathcal{S}_{A_{i}B_{j},A_{i}}/\mathcal{S}^{\rm max}_{A_{i}B_{j},A_{i}}$ , to obtain

[TABLE]

where we calculate the rank ordering for both halos in question since this ordering can be quite different. This combined merit maximises the total shared number of particles while also weighting the match by the number of equally well bound shared particles.

Finally, not all particles need be used to calculate merits, particularly if the input catalogues are sorted in a physically meaningful way, such as binding energy. One can limit the merit to a fraction $f_{\rm TF}$ of these particles. Limiting the comparison to the most bound particles can be key to correctly following major mergers.

2.2 Halo Merger Trees

Halo merger trees are more than a simple cross comparison of halo catalogues. They follow the evolution of halos, the mass accretion history, tidal disruption and interaction with other halos, ideally following the formation of a halo across cosmic time till either the present day or the point at which the object is tidally disrupted as it falls into a larger halo. Before discussing how trees are constructed, it is important to lay out some terminology.

•

A progenitor/descendant is a (sub)halo present at a previous/later time that points to a halo present at a later/earlier time as being its descendant/progenitor.

•

A primary progenitor/descendant is where a (sub)halo’s descendant/progenitor points back to it as its progenitor/descendant. Note that due to issues with the halo finding process and physical processes involving mass transfer, it is possible for objects to have more than a single candidate descendant. We discuss this in more detail later.

•

A main branch in a tree is one which traces the evolution of a halo from its first progenitor to its final descendant, moving forwards/backwards via the object’s primary descendant/primary progenitor links.

•

A secondary progenitor is where a (sub)halo has merged with the main branch of another halo and ceases to exist as an independent object. This defines sub-branches of the main branch and is a natural consequence of structure formation which we discuss in more detail later.

•

A secondary descendant is where a (sub)halo has identified multiple possible links. All links that have lower merits are secondary matches. Such matches are a natural consequence of mass loss, where an object falls into another object and still has a surviving primary descendant but some of its mass has been associated with the accreting object, generating a low merit link.

•

The first (root) progenitor of a branch is the object that has no progenitors. The final (root) descendant of a branch is the object that has no descendants, which typically occurs at the last snapshot.

Building trees is complicated by the imperfect (sub)halo finding process (see Srisawat et al., 2013; Behroozi et al., 2013; Avila et al., 2014; Wang et al., 2016; Poole et al., 2017, for discussions of the pitfalls of tree building). Halo finders can artificially merge halos at a given snapshot and later separate them, generating missing links in the tree. This problem is particularly acute for low mass halos that lie near the particle number threshold used by the halo finder or for subhalos close to the centre of their host halo. Stranded (sub)halos lacking a progenitor are less of a problem with VELOCIraptor, the (sub)halo finder used in this study, but no (sub)halo finder is completely immune. Critically, such events can occur at much higher masses. With fine-scale temporal resolution, the merger tree will have halos popping in and out of existence.

This can lead to several crucial issues in the resulting tree:

•

Truncation: where the (sub)halo finder cannot find an object for one or more snapshots, leading to premature disruption of the object, and possibly leaving a large object identified at later times with no progenitor (leading to the appearance of halo fragmentation).

•

Flip-flopping: where links between two objects are swapped at one snapshot but corrected in subsequent snapshot(s), leading to large changes in the object’s properties in the snapshot where it happens.

•

Branch swapping: which is similar to flip-flopping, except the tree builder does not correct it and so the objects continue their independent evolution, leaving a single point with a sharp change in properties.

All trees will suffer from these issues, the degree to which they do depending in part on the (sub)halo finder. A variety of methods have been used in literature to handle this cases, some simple (e.g. Fakhouri & Ma, 2008; Genel et al., 2010, which are specific for simple FOF merger trees), some more complex (e.g. Behroozi et al., 2013; Poole et al., 2017). TreeFrog uses several techniques to minimise the occurrence of these issues, the most critical one being that it searches multiple snapshots for candidate links. We discuss the specific extra steps taken when producing progenitor and descendant based tree.

Progenitor Based Tree:

The input catalogue is processed by comparing objects (both halos and subhalos) found at a snapshot to those in preceding snapshots, moving backwards in time. We start by linking a snapshot with the one immediately preceding it, identifying matches for all objects. Objects are allowed to have multiple progenitors but no object by construction will have multiple descendants. We rank temporal links such that the primary progenitor of an object is the one which maximises the merit looking backwards in time. If two objects share the same progenitor, the one with the lower merit has the link removed. If an object has a poor merit, typically below $\mathcal{M}_{\rm lim}\sim 0.05$ , the link is removed. The remaining, highest merit link is deemed a primary progenitor/descendant link. For objects with no progenitor, earlier snapshots are searched until a viable progenitor is found, up to a maximum number of snapshots $\Delta_{s}$ from the current one.

Descendant Based Tree:

The input catalogue is processed by comparing objects (both halos and subhalos) found at a snapshot to those found at later snapshots, moving forwards in time. We start by linking a snapshot with the one immediately following it, identifying matches for all objects. In the absence of tidal disruption, objects should have a single descendant. However, mass loss and tidal disruption are natural processes that complicated tree building, producing links to several candidate descendants. Consequently, we allow objects to have multiple candidate descendants. Candidate descendants are split into two categories: primary and secondary links. A primary descendant link is one where a halo’s best candidate descendant, that is the one with highest merit amongst the object’s matches, also ranks the halo as the best amongst all its matches going backward in time. All other connections are classified as secondary links. Secondary links arise from the physical tidal disruption and merging processes as well as unphysical merging of halos where the halo finder fails to identify an object. TreeFrog does not attempt to differentiate between these processes. If an object does not have a primary descendant, subsequent snapshots are searched till a primary is identified or the maximum number of snapshots to be searched, $\Delta_{s}$ , has been reached.

Once an initial tree has been constructed, TreeFrog attempts to correct the tree for truncation events (and the associated branch swapping that may result from them) that arise from the loss of the object by the (sub)halo finder. Objects that lack a primary progenitor are corrected for as follows. For an object $A_{t}$ that does not have a primary progenitor, we examine the best ranked secondary progenitor, $B_{t-1}$ and this secondary progenitor’s primary descendant, $B_{t}$ , if such an object exists. If this object $B_{t}$ has a secondary progenitor $C_{t-1}$ which itself has no primary descendant and has a merit $\mathcal{M}_{C_{t-1}B_{t}}$ within a factor of $f_{\mathcal{M}}\sim 0.5$ of $\mathcal{M}_{B_{t-1}B_{t}}$ and above $\mathcal{M}_{\rm lim}$ , we adjust the links so that instead of $(B_{t-1}\rightarrow B_{t},C_{t-1}\rightarrow\varnothing,\varnothing\rightarrow A_{t})$ , we have $(C_{t-1}\rightarrow B_{t},B_{t-1}\rightarrow A_{t})$ . A schematic of this branch fix is shown in the top panel of Fig. 2.

We also apply further corrections to objects with no primary progenitor as a post-processing step that relies on using the full history of an object, specifically the final descendant of a main branch. We identify objects that do not have primary progenitors but have secondary progenitors. Specifically, for an object $A_{t}$ , we examine its best ranked secondary progenitor, $B_{t-1}$ , and that object’s best descendant ${B_{t}}$ . If both objects end up with the same final descendant, it is possible progenitors have been incorrectly assigned due to artificial merging of objects at $t-1$ . Thus we then search for an object $C_{t_{i}<t-1}$ that has a primary descendant after the merger at $t-1$ , $C_{t_{i}>t}$ , that belongs to the same final descendant and that has a similar phase-space position and number of particles as $B_{t}$ . Specifically, we require:

[TABLE]

Here ${\bf x}$ & ${\bf v}$ are the positions & velocities, $R(V_{\rm max})$ & $V_{\rm max}$ , are the maximum circular velocity radius & is the maximum circular velocity and $N$ is the number of particles belonging to the object. This phase-space check is a simplified form of halo tracking. We also limit this patching to well resolved objects, that is those composed of $\beta_{N_{\rm lim}}N_{\rm lim}$ where $N_{\rm lim}$ is the particle limit used by the (sub)halo finder, $\beta_{N_{\rm lim}}\gtrsim 2$ . Full gravitational evolution and unbinding, particularly for poorly resolved objects, is best done using halo tracking tools (like WhereWolf, Poulton et al., in prep). If this match meets these criteria, we then correct the branches so that instead of $(B_{t-1}\rightarrow B_{t},C_{<t-1}\rightarrow C_{>t},\varnothing\rightarrow A_{t})$ , we have $(B_{t-1}\rightarrow A_{t},C_{<t-1}\rightarrow B_{t}\rightarrow C_{>t})$ . A schematic of this branch fix is shown in the bottom panel of Fig. 2. The parameters $\alpha$ are order unity and we have found $(\alpha_{R(V_{\rm max})},\alpha_{V_{\rm max}},\alpha_{N})=(2.0,1.0,0.05)$ corrects most events.

2.3 Code Structure

TreeFrog is a c++ code that uses OpenMP+MPI APIs for parallelisation but can be compiled in serial mode, solely with OpenMP, and solely with MPI. The code requires an input file containing a list of halo catalogue file names, the number of snapshots to process and an output file name.

The main input file is a simple text file that lists the locations of the halo catalogues ordered in increasing time. These halo catalogues can be in native VELOCIraptor output (ASCII, Binary, and HDF5444Self-describing binary format, library found at https://www.hdfgroup.org/) or in a simple ASCII format that was used in the SUSSING Merger Trees workshop (see Srisawat et al., 2013). Other input formats can be implemented and the input data must contain a list of particle IDs and possibly particle types for each halo in the halo catalogue.

MPI domain decomposition is temporal in nature, with each thread loading the halos from an entire simulation volume for a certain number of snapshots. Snapshots are distributed to different threads by load balancing the memory footprint on each MPI process. Specifically, snapshots are split to ensure that each thread loads either roughly the same number of total halos or the same total number of particle ids (depending on runtime configuration). An MPI thread loads a minimum of $2\Delta_{s}+1$ and each MPI domain must overlap the neighbouring domains by $\Delta_{s}$ so as to have a complete list of connections to and from the snapshots localised to a single mpi domain.

Although there are a variety of modes that TreeFrog can be operated in, there are three principal ones. TreeFrog can be used to produce a Descendant Tree, Progenitor Tree or simply cross correlate two catalogues. It produces the following types of output formats: ASCII; HDF5 (preferred). The output file(s) consists of a list of a halo, the number of descendant/progenitor/cross matches and the ID of these linked objects, for all halos identified at a given snapshot. In the ASCII format, this is combined into a single continuous file, whereas in the preferred HDF5 format, each snapshot is written separately. An ADIOS interface will be included and will have the same naming convention as the HDF5 output.

Post-processing of the full tree information is done using python scripts to produce a simple, walkable tree where each (sub)halo will have links to their immediate progenitor, immediate descendant, first progenitor and final descendant (that is eliminating all secondary links and merit information), the typical information need by SAMs. This post-processing removes secondary links.

Options can be passed either via command line or through a text file. We list the configuration options that can be passed via this input text file in Table 1.

3 Results

Here we present how well trees are built. As input we primarily use a small cosmological N-Body simulation consisting of $512^{3}$ particles (from the SURFS suite Elahi et al., 2018). Simulation details are presented in Table 2.

We focus on trees produced using two input halo catalogs produced using VELOCIraptor: a simple 3DFOF (3D configuration space Friends-of-friends) catalogue that does not contain subhalos; a 6DFOF (phase-space) halo catalogue; and a full (sub)halo catalogue using fiducial parameters for VELOCIraptor. Details of how VELOCIraptor identifies (sub)halos are presented in Elahi et al. (2019). Here we summarise: the code is a phase-space (sub)halo finder that identifies structures in a two-step process: it identifies field halos using either a 3DFOF algorithm or a 6DFOF algorithm; and then identifies substructures for each halo by linking dynamically distinct particles using a phase-space FOF algorithm and searching for major merger remnants using an iterative 6DFOF.

We show examples from our 3 halo catalogs in Fig. 3. The 3DFOF halo extracted from our L40N512 simulation at $z=0$ is composed of $\approx 10^{6}$ particles with a FOF mass of $4.2\times 10^{14}h^{-1}{\rm M}_{\odot}$ and an overdensity virial mass of $M_{\Delta\rho_{c}}=2.7\times 10^{14}h^{-1}{\rm M}_{\odot}$ , where $M_{\Delta\rho_{c}}=4\pi\Delta\rho_{c}R_{\Delta\rho_{c}}/3$ , $\rho_{c}$ is the critical density, and $R_{\Delta\rho_{c}}$ is the radius enclosing an average density of $\Delta\rho_{c}$ , where $\Delta=200$ , commonly referred to as the virial mass. This 3DFOF object was identified using the standard 3DFOF linking length of $\ell_{x}=0.2L_{\rm box}/N_{p}$ , where $L_{\rm box}/N_{p}$ is the inter-particle spacing. This 3DFOF halo consists of several large density peaks, some of which lie outside the virial radius centred on the largest density peak. The 6DFOF halo is the largest object of the initial 3DFOF candidate and the density peaks that were outside the virial radius of the 3DFOF are now considered separate 6DFOF halos. The 6DFOF halo contains at least 4 large density peaks and numerous smaller ones, speaking to a rich merger history, with several major mergers in the recent past and likely several mergers in the near future. At $z=0$ , this object contains 222 substructures (including the host halo), 3 of which contain $21\%$ of the initial host 6DFOF halo’s mass. We refer to this halo as our fiducial case as this object has a complex merger history, undergoing a quintuple merger.

The trees are built using 200 snapshots spaced evenly in $\log a$ , where $a$ is the scale factor, starting at $a_{i}=0.04$ and ending at $a_{f}=1$ . The cadence is such that at late times the temporal spacing between snapshots is $\sim 250$ Myr. We produce several trees for each halo catalogue, varying the merit function and the number of snapshots searched for links. We focus on the descendant based tree as walking forwards in time provides a natural method to correct trees, namely that an object should have a primary descendant. If an object lacks a primary descendant (or any viable descendant), further snapshots can be searched till a primary descendant is found. In contrast, progenitor trees are only corrected for objects that lack a progenitor.

For this analysis, we also make use of Merger Tree Dendogram Poulton et al. (2018). These dendograms capture the mass accretion history of (sub)halos and their orbital evolution. Ideally the mass accretion history of a halo should be smooth, increasing with time, whereas subhalos should slowly shrink, losing most the their mass near peri-centric passage.

3.1 Individual Halo

We examine the reconstructed merger history of the fiducial halo presented in Fig. 3 in our three halo catalogues using dendograms in figures 4-7. These figures present the mass and orbital evolution of branches of a tree and any objects the main branch may have interacted with (see Figure 4 from Poulton et al. (2018) which describes in detail the information the dendogram tries to capture). We start with the simplest halo catalogue, the 3DFOF one and build a descendant tree using a single snapshot and a simple merit function to identify links and proceed to add corrections to the tree and complexity to the input (sub)halo catalogue.

We present in Fig. 4 the dendogram from the tree built on a simple FOF catalogue with the simplest merit function, Eq. (1), using all particles to calculate merits. This dendogram shows the mass accretion history and motion relative to the first progenitor of the main branch, along with the relative radial position of a sample of large sub-branches and interacting branches. We also show the merit between matches via the colour and highlight the mass accretion history of the 4 largest branches in the inset.

The figure shows that the FOF halo has a simple mass accretion history. The main branch halo continuously grows in mass, absorbing smaller halos (sub-branches). There are several kinks it the main branch’s motion. These occur during major mergers, where the centre-of-mass can shift, moving to the density peak corresponding to the other halo. The merit of the main branch remains close to unity till the last snapshot, where by construction $\mathcal{M}=0$ as there are no descendants. Using the more complex ranking merit scheme given by Eq. (3) leaves the tree generally unchanged as the input catalogue is simple.

The sub-branches here merge well outside the main branch’s virial radius, a natural outcome of a 3DFOF catalogue. The sub-branches typically have merits close to unity, till they merge with the main branch, where the merit becomes very low (typically $\lesssim 10^{-2}$ , the exact value depending on the merit function). The sub-branch mass evolution is generally smooth, although at least two sub-branches, number 8 & 9 are truncated. These objects are actually descendants of sub-branch 7. This object leaves enters and momentarily re-emerges the FOF envelop of the main branch twice. Since we do not allow for halo fragmentation, these objects are left stranded in the tree. These truncation events can be fixed by searching for descendants across several snapshots and using the full merit function given in Eq. (3). Using all particles to calculate the simple merit in Eq. (1) incorrectly links these stranded halos to small halos with a very low merit, i.e., branch swapping events.

Using a 6DFOF input catalogue corrects some of the issues present in the original tree, indicating how the performance of a tree depends on the input catalogue as seen in Fig. 5. Objects now merge later, there are fewer truncated large sub-branches but flip-flopping events are still present. There are instances of the large halos linking to small halos, giving rise to the significant drops in mass (see sub-branch 2). Despite a few sub-branches behaving poorly, 3DFOF/6DFOF trees are relatively stable, with the critical issue lying with the misleading physics implied by this tree. Objects should persist till well inside the virial radius. This either requires tracking of FOF particles using codes like hbt+ Han et al. (2018) or identifying substructure555The extra complexity introduced by substructure (both in identifying it and determining optimal branches) and the relative simplicity of 3DFOF mergers trees is the motivation behind codes like hbt+, which takes as input the particles in 3DFOF halos and parses them through an unbinding routine to follow their evolution, building a substructure hierarchy and a halo merger tree at the same time. The drawback is that the input 3DFOF catalogue must have high enough cadence to capture the formation of FOF halos and the code requires full particle information across cosmic time..

Figure 6 shows how adding the identification of substructure significantly complicates the process of tree building. This dendogram now contains interacting branches, aka subhalos, as well as the main branch and sub-branches. We can see objects merging well within the virial radius of the main halo. The obvious issues in this halo’s reconstructed history are: the main branch starts abruptly and there are several large objects left stranded in the tree with no progenitor. In some cases, the truncation arises from the fact that the halo finder loses track of an object for at least one snapshot. A more subtle issue present is the change in the motion of the main branch. The distance plotted here is the relative comoving distance from the position of the first progenitor. A change in the motion is suggestive of a branch swapping event earlier in the object’s history.

The typical cause of these issues are major mergers. The basic assumption underlying TreeFrog and many tree builders is that particles orbit an individual object and thus can be used to trace the evolution of object. Most of the time a halo grows by the smooth accretion of material or the tidal disruption of smaller objects and the vast majority of particles in the environment principally orbit the potential well defining the main halo. However, the orbits of particles during major mergers are complex. Some particles are ejected from the system altogether, some have orbits that swap the potential well they are orbiting and others orbit both potential wells. The fraction of particles quickly evolving orbits steadily increase with time, starting with the loosely bound particles and progressing to increasingly bound particles as the objects coalesce. Therefore, using all the particles can give rise to flip-flopping, branch swapping and even truncation, clearly seen in Fig. 6.

These problems are fixed by:

•

Searching for links across multiple snapshots.

•

Using the most bound particles or ranking particles by how well bound they are to determine the quality of the match.

•

Correcting objects with no primary progenitors.

Figure 7 shows the resulting dendogram once multiple snapshots are searched, here 4 snapshots corresponding to $\sim 1$ Gyr or $\sim 1$ free-fall dynamical time, $\tau\propto\sqrt{3\pi/32G\rho},\rho=200\rho_{c}$ . We also use a fraction of the most bound particles are used, here $0.4$ to calculate merits666The exact fraction depends on the (sub)halo finder used. Configuration-space (sub)halo finders will artificially shrink large subhalos as they fall to the centre, whereas phase-space finders might overestimate the mass assigned to the infalling object as the object is dynamically heated. For phase-space finders like VELOCIraptor, we find using $\lesssim 50\%$ minimises branch swapping events., and we correct for missing progenitors/branch swapping events across multiple snapshots. Large subhalos that previously sprang into existence inside the virial radius now are connected. The motion of the main branch is now in better agreement with the original main branch motion seen in Fig. 4.

3.2 Tree statistics

We now turn to the overall statistics of the tree. Ideally, objects form when composed of few particles and once formed should always have a descendant. Yet poorly resolved objects can evaporate and be left without a descendant, particularly if the time between snapshots is short. We examine the statistics of when objects form and the fraction of objects without a descendant in figures 8 & 9 respectively, focusing on the 3DFOF tree, and the 6DFOF+substructure tree using a single snapshot to identify links and 6DFOF+substructure tree built using 4 snapshots. We argue that these statistics are more informative than the common practice to examine branch lengths (e.g. Srisawat et al., 2013), i.e., the number of snapshots a main branch exists for, as the length of a main branch depends sensitively on the cadence used in producing the tree.

Clearly the median formation particle number in Fig. 8 is very close to the 20 particle limit used to identify structures for all three trees, showing little evolution with $50\%$ of all newly formed objects being composed of $\leq 25$ particles. The upper $84\%$ quantiles do show some dependence on cosmic time, increasing from $25$ particles up to close to twice the 20 particle limit. This increase in $N_{p}(z_{\rm form})$ with cosmic time partly due to the larger physical time between snapshots at late times which allows halos that lie below the 20 particle threshold to grow more. However, the fact that both upper quantiles decrease when using more snapshots to identify links in the tree (going from 6DFOF.SUBS.t1 to 6DFOF.SUB.t4) indicates that this is not the sole reason. Mergers between poorly resolved halos can cause breaks in the tree as one of the halos is lost for one (or more) snapshots before re-emerging, the result being a halo with an artificially inflated $N_{p}(z_{\rm form})$ . The reduction in the upper $97.5\%$ quantile from objects composed of $\gtrsim 100$ particles to $\sim 80$ when going from using a single snapshot to using 4 snapshots is clearly evidence of truncation. Using multiple snapshots ensures that vast majority of objects form close to the particle limit and even the largest, newly formed objects in the tree are composed of $\lesssim 100$ particles at all times.

In figure 9, we see that the fraction of objects lacking descendants remains roughly constant across most of cosmic time for all trees, decreasing slightly at earlier times. We note that for objects composed of twice the particle limit used in the halo catalogue, this fraction is small $\lesssim 10^{-3}$ . As we go from the easier problem of following FOF halos to following 6DFOF halos and their substructure we find a slight increase in the fraction if a single snapshot is used. In general, the fraction with no descendants is small and this population is dominated by poorly resolved objects, with $99\%$ composed of $\lesssim 40$ particles. Approximately $10\%$ of very poorly resolved objects composed of $<40$ particles have no descendants for trees built using a single snapshot, with the number increasing slightly when using 6DFOF halos.

Searching multiple snapshots reduces this fraction, dropping it by a factor of 2 for objects composed of $\geq 40$ particles, and reducing the amount for objects composed of $<40$ particles to a more reasonable $2\%$ . This fraction increases by a factor of 4 in the last 4 snapshots where the number of snapshots used to correct the tree begin to drop, rising from $5\times 10^{-4}$ to $2\times 10{-3}$ . This increase demonstrates the need for searching multiple snapshots and running simulations past the last desired redshift to correct the catalogue at these late times, though the exact number of snapshots depends on the cadence of the input catalogue. For most snapshots, roughly $4\%$ of objects composed of $\geq 40$ particles have descendants found more than a single snapshot in the future.

A tree should be constructed so as to have a clear distinction between the main branch and sub-branches. Several merit functions are in common use to rank matches, separating primary descendant/progenitor links that define the main branch and secondary descendant/progenitor links that define sub-branches which merge with the main branch (see Srisawat et al., 2013, for a sample). The most common merit function maximises the number of shared particles in some form, sometimes using all particles (see for instance Knollmann & Knebe, 2009), sometimes using only the most bound set of particles (see for instance Jiang et al., 2014). Poole et al. (2017) argued for a merit function that used the particles self-binding ranking (Eq. 2). TreeFrog can use several merit functions, which we show the results of in Fig. 10.

This figure shows that $50\%$ of primary descendants, the merits are close to 1, regardless of the type of merit function used. For our fiducial merit function, which is a combination of using some fraction of the most bound particles and Eq. (3), we find primary descendant merits of $\sim 0.75\pm 0.1$ , showing little evolution. Only $\sim 1\%$ of the population has $\mathcal{M}\lesssim 0.2$ and then only at late times. Secondary descendant merits on the other hand are on average $\lesssim 10^{-1}$ and evolve strongly with time, a consequence of large, well resolved halos accreting small (sub)halos through natural bottom-up growth. The ever increasing mass ratios probed at late times causes the median secondary merit to drop.

Comparing to other merit functions, we find using all particles increases primary merits, and removing the ranking merit, that is using Eq. (1), increases the primary merit further. However, this increase in primary merit values is counterbalanced by a similar increase in secondary merits. Simply using Eq. (1) increases the median secondary merits by a factor of $\sim 3$ relative to the default merit function. Critically, the separation between primaries and secondaries is largest using the default scheme. For the fiducial merit, the distribution in merits only overlaps at the $2\sigma$ level at late times. Using the shared number of particles increases the overlap in the population from $\sim 3\%$ to $\sim 7\%$ (for a more formal comparison of the distances between primaries and secondaries see §A). These changes argues in favour of using Eq. (3) over Eq. (1).

3.3 Mergers

We examine the details of when sub-branches merge with the main branch here. The expectation is that when (sub)halos eventually merge with other (sub)halos as a sub-branch, this should occur well within the virial radius of the accreting (sub)halo. The merger statistics of these trees is shown in Fig. 11. Here we have for every (sub)halo across cosmic time identified secondary descendants and noted the relative radial distance the primary and secondary descendant.

As expected, trees constructed from 3DFOF catalogues have a majority of objects merging outside the virial radius. The overall distribution of mergers is not only skewed to large radii but larger objects merging at larger radii, which is unphysical as these objects should be less prone to tidal disruption. Of course, this is a natural consequence of using a 3DFOF halo catalogue but is useful for showing an extreme case.

The merger statistics of the trees built using a full halo+subhalo catalogue show that even for a single snapshot linking most objects merge inside the virial radius and critically, large objects are found to merge well inside accreting halos. Interestingly, using more snapshots to identify descendants shifts the median radius at which objects merge slightly to larger radii. This is a consequence poorly resolved halos at the outskirts of larger halos that can flicker in and out of the halo catalogue as they drop below/move above the particle limit used in the (sub)halo finder. Without multiple snapshots searched, these objects are left with no descendants (primary or secondary). Searching multiple snapshots links these objects to the larger halo as secondary descendants.

An interesting feature in this figure is the presence of large objects that merge at small radii. Naively, the ideal scenario is for objects to merge when composed of few particles deep within another object. However, a natural consequence of major mergers is that objects can phase-mix while still relatively intact. For mini mergers, where the accreted object is far less massive than the accreting object, we expect the smaller object to orbit several times as it is tidally stripped, shrinking to the point at which it becomes completely tidally disrupted. Thus, we should see objects with large accretion masses relative to their accreting host merging at smaller radii, possibly with large masses, compared to mini mergers with very small mass ratios. We note that by accretion, we mean the point at which the object enters the FOF envelope of another halo.

To explore this dichotomy, we take a random sample of accretion events that fully merge before $z=0$ with halos composed of $\geq 10^{5}$ particles, splitting the population by the accretion mass ratio. We split objects based on accretion mass ratios into those with ratios of $\leq 10^{-2}$ (mini mergers) and those with $\geq 5\times 10^{-2}$ (containing both minor to major mergers), although the precise split is not critical. The host halo limit and ratio cuts means that minor/major mergers are composed of $\gtrsim 5000$ particles at accretion. We should stress that this sample is biased as we are focusing on objects that were accreted and then merger with the host within $\lesssim 3$ Gyr. Many objects do not merge with their host halo within this time and are not present in this figure, consequently the mass loss rate of this population is higher than the full population. For each merged object we determine the average mass change from one snapshot to the next since accretion till it mergers. We plot the total population and the median values of accretion mass, particle number at merger, merger radius and mass change in Fig. 12, which has two key features.

The first feature of note is that on average, so-called minor/major mergers fully merge at smaller radii than mini mergers (objects with very small accretion mass ratios). The former events typically merge inside the scale radius of the host halo (see median values), latter outside. The merger radius does not show a dependence on particle number at accretion for either population.

Second, on average mini mergers steadily lose mass as they orbit, on average losing $\sim 75\%$ of their mass every 250 Myr (as indicated by the colour). The average fractional mass change does show some dependence on the accretion mass ratio of mini mergers, with small mini mergers ( $N_{p,{\rm acc}}/N_{\rm H}(t_{\rm acc})\lesssim 10^{-3}$ ) show little average change in mass till merging. In contrast, objects with large accretion mass ratios typically do not steadily lose mass once accreted. Instead, they rapidly phase-mix when they enter the central regions of the host halo and typically remain in the central regions due to dynamical friction (Han et al., 2018, also finds large subhalos are “trapped” in the central regions of their host due to dynamical friction using hbt+, a 3DFOF tracker). These objects can fully merge with the host halo while still close to their accretion mass, such as the sub-branch seen in Fig. 7. Others are last identified when composed of a few hundred particles having been accreted when composed of several thousand particles, with most of the mass loss occurring in the last step at which the object was identified.

The physical imprint of accretion mass on the dynamics of the merger is reproduced by the tree, though recovering this bimodal distribution requires a phase-space finder that does not artificially shrink halos as they fall inwards. The high mass loss rates of minor/major merger as they phase-mix means that trees should cross match only the most bound particles and ranking particles according to binding energy in order to recover the last inspiral, although using too few particles can give rising to core swapping, were a small subhalo takes over the branch.

3.4 Accretion and Merger rates

We end by examining the merger rates and mass growth via mergers. We calculate the “mean merger rate per halo” expressed in terms of redshift, descendant mass and mass ratio between primary progenitors and secondary progenitors in Fig. 13. For every halo with mass $M_{D}$ having multiple progenitors, we determine the mass ratio between the primary progenitor and secondary progenitors, $\xi\equiv M_{P,s,{\rm tot}}/M_{P,p,{\rm tot}}$ , where we use the total exclusive mass associated with the object and bin in mass ratio bins, averaging over a redshift range of $z=0.5$ to [math]. We also calculate the “mean accretion rate of FOF halos per halo”, that is we identify all objects that are progenitors or substructures of a descendant halo which were FOF halos at the previous snapshot, that is we define accretion as a FOF halo entering the FOF envelop of another, more massive, FOF halo. This FOF halo can survive as a subhalo or can merge with the accreting FOF halo. This “FOF accretion rate” is analogous to the “FOF merger rate” reported in Fakhouri & Ma (2008); Fakhouri et al. (2010); Genel et al. (2010), who examined the rate at which FOF halos merge with one another. This is also analogous to the “corrected substructure merger rate” presented in Poole et al. (2017)777Both accretion and merger are used in the literature, sometimes describing the same physical process, which can be a bit confusing. Here we refer to mergers as instances where an object is found to be a secondary progenitor of another object later in time, that is the object has ceases to exist as an independent object, and is a secondary branch of another object that continues to exist. Accretion events are specifically when a FOF object enters the FOF envelop of another more massive object..

Fakhouri et al. (2010) showed that this FOF merger rate has a nearly universal dependence on $\xi$ , with little dependence on descendant mass and redshift, and is characterised by:

[TABLE]

where $B$ is the number of mergers per unit volume, redshift and mass ratio, $n$ is the number density of halos, and $A,\alpha,\eta,\beta,\tilde{\xi},\gamma$ are all fitting parameters. Fits show $\alpha,\eta$ are small, indicating a weak dependence on descendant mass and redshift.

We find that both merger and accretion rates have forms similar to the FOF merger rate of Fakhouri et al. (2010), a roughly power-law dependence on $\xi$ with a flattening at large mass ratios with little descendant mass dependence (as seen by the fact that the coloured lines overlap in the top subpanel). We also find little merger rate differences between our small volume L40N512 run and our larger volume, poorer mass resolution L210N1536 run, indicating a converged merger rate. The differences between our rates and the fits of Fakhouri et al. (2010) (and Poole et al., 2017) are that the rate is lower and the high mass ratio rate is flatter. The amplitude is closer to the Poole et al. (2017) “corrected substructure merger rate” fit, though this rate is still higher. The reduced rate is a result of defining halos as 6DFOF objects rather than 3DFOF objects 8883DFOF objects can artificially join halos by thin particle bridges, effectively pushing back the accretion time. Using a 6DFOF algorithm removes these particle bridges, moving the accretion time to the point at which the virialized envelops begin to dynamically overlap, increasing the number of halos, see Elahi et al. (2019). Modifying the “corrected substructure merger rate“ fit from Poole et al. (2017) to account for the larger number of 6DFOF objects moves this fit into better agreement with the measured accretion rate.

There is little difference between the measured merger and accretion rates despite the fact that a significant amount of time (and evolution) can elapse before an accreted object fully mergers. The mapping from FOF accretion rates to merger rates is non-trivial. Merger time scales depend on the orbit, the tidal mass loss rate, and the initial accretion mass ratio (subhalos with large ratios should experience little tidal mass loss). Subhalos can merge through a variety of channels, some artificial (lost by the (sub)halo finder, numerical evaporation) and some physical (tidal disruption, phase-mixing). At $\xi\approx 1$ , one might expect a simple delay between accretion and merging as large subhalos will quickly sink to the centre of the host due to dynamical friction, leaving the the functional form unchanged. As one transitions from the regime dominated by dynamical friction to tidal mass loss, near $\xi\sim 5\times 10^{-2}$ , the mapping from accretion to merging becomes more complex and we might expect a change in the functional form. This does appear to be the case, with the largest difference between these rates occurring between $10^{-2}\lesssim\xi\lesssim 10^{-1}$ . The

The merger rate indicates that halos, on average, experience more minor mergers than major ones but will acquire more mass during major mergers. Based on the modified fit, we expect $10^{12}h^{-1}{\rm M}_{\odot}$ halos at $z=0$ to have experienced $\sim 10$ minor merger events ( $\xi=[10^{-3},5\times 10^{-2}]$ ) from $z=1$ to $z=0$ compared to a single major merger event ( $\xi>5\times 10^{-2}$ ), yet halos gain most of this mass in a single major merger, $18\%$ compared to $5\%$ . The fit predicts halos should acquire $\approx 20\%$ and $\approx 79\%$ of their mass integrated over cosmic time through minor and major mergers respectively. In agreement with this prediction, we find $20\pm 10\%$ and $31_{-13}^{+46}\%$ are accreted through these two channels999The prediction ignores “smooth” mass accretion whereas halos in our simulation do accrete material not contained within smaller halos, accounting for $\approx 46\%$ of a halo’s mass growth..

The total mass accreted in merger events and its evolution is shown in Fig. 14, where we split halos into three different $z=0$ mass bins. At all times, halos principally grow through minor/major mergers, on average gaining $\sim 85\%$ of their mass during such events. The overall amount of mass acquired through high mass ratio mergers is greater than that acquired in mini mergers. Initially, it appears objects only grow through minor/major mergers at high redshift, however, this is partially due to finite resolution.

The influence of resolution can be seen by comparing results from our reference L40N512 to our larger volume, lower mass resolution simulation, L210N1536. In the lowest mass bin, $10^{11}h^{-1}{\rm M}_{\odot}$ , halos are composed of $700-7000$ particles in L40N512 compared to $140-1400$ in L210N1536, with the bin dominated by lower mass objects (as a result of the mass function). The total mass gained by $z=0$ is similar for these halos in both simulations, $58^{+31}_{-20}\%$ compared to $54_{-19}^{+28}$ , where the uncertainties indicate the halo-to-halo scatter. However, the amount of material accreted through mini mergers in the L210N1536 simulation is significantly reduced. The halos in L210N1536 are not well resolved enough to follow mini mergers and only experience mini mergers at late times, after $z<1$ , gaining only $2\%$ of their mass via this channel and only for the largest halos in this mass bin. Improving the mass resolution results in mini-mergers occurring as early as $z\sim 3$ , with halos gaining $14_{-8}^{+12}\%$ of their mass through mini mergers. This will impact the internal mass distribution of halos as minor/major mergers centrally deposit their mass.

The mass accretion also shows the imprint of a finite simulation volume, particularly at late times. The largest halos in the smaller volume L40N512 run have a smaller amount of mass acquired through minor/major mergers than similar mass halos in the larger volume L210N1536 run, with few objects experiencing major mergers after $z=0.5$ . The total amount of mass acquired in these minor/major mergers is $26_{-15}^{+23}\%$ compared to $36_{-17}^{+32}\%$ . The inclusion of large-scale power in L210N1536 gives rise to rarer density peaks, altering the mass accretion rate onto these peaks. Klypin & Prada (2018) show that the $z=0$ halo mass function at high masses is suppressed in smaller simulation volumes, particularly at cluster mass scales in agreement with theoretical predictions (also see for instance Bagla & Prasad, 2006; Warren et al., 2006; Tinker et al., 2010; Schneider et al., 2016; Comparat et al., 2017, for discussion of finite volume effects on power-spectra and the halo mass function). Finite volume effects on the mass accretion history has yet to be thoroughly investigated and is beyond the scope of this paper.

4 Discussion and Conclusion

We have presented TreeFrog, a code designed to follow the evolution of cosmic structures like halos and subhalos. We have demonstrated that the code tracks (sub)halos across cosmic time, particularly cases that are typically notoriously difficult for such codes, namely major mergers. We summarise key features and results below.

TreeFrog is a tree builder code that can take a variety of halo catalogue inputs. At its core, it is a particle correlator, using particle IDs to match halo catalogues. Used in concert with VELOCIraptor (or any halo finder where the input particle lists are arranged in a meaningful order), it uses a combination of merit functions along with a subset of the most bound particles to determine the best matches. We have shown that searching multiple snapshots for candidate descendants based on the combined rank ordering/number of shared particles merit using $\sim 50\%$ of the most bound particles well reconstructs the accretion histories of objects with complex interactions (those that experience major mergers and host significant amounts of substructure), even for objects which contain substructure and become subhalos of a larger host.

The combined merit used by TreeFrog better separates primary progenitors/descendant links from secondary ones, with the primary and secondary merit distributions overlapping at the $2.5\%$ level, unlike the commonly used number of shared particles based merit, where the distributions overlap at the $7\%$ level. Searching multiple snapshots for possible descendants is critical, reducing the number of very poorly resolved objects with artificially truncated lives from $\sim 10\%$ to $\sim 2\%$ . The reduction in truncation events and other tree pathologies advocates the need to run simulations into the future, past the desired last redshift, a practice also advocated by Poole et al. (2017).

The net result is that only a small fraction of objects either start or end life composed of too many particles. Less than $1\%$ of objects begin their lives composed of $\gtrsim 100$ particles, above the halo catalogue particle limit of 20. A negligible fraction of objects, $\sim 10^{-4}$ , composed of $\sim 40$ particles have artificially truncated lives, ending with no descendant. Typically, these objects are poorly resolved halos in the process of being tidally disrupted that are falling towards another halo.

With well built trees, we find a significant difference in the merger behaviour of small subhalos and major merger remnants. Mergers, those objects that are accreted by another halo with accretion mass ratios of $\gtrsim 5\times 10^{-2}$ fully coalesce or merge at smaller radii that subhalos, those with accretion masses ratios of $\lesssim 10^{-2}$ , due to the effect of dynamical friction. These objects do not experience significant tidal mass loss, the slow stripping of outer less bound material. Instead, they begin to phase-mix, with the mass assigned to the merger remnant rapidly being assigned to the host halo once they move close to or inside the scale radius of the host halo.

We find that the mass accretion history of a host halo is dominated by major mergers. In agreement with previous studies, we find $20\pm 10\%$ of the mass is accreted through minor mergers and $31_{-13}^{+46}\%$ through major mergers. The reconstructed merger rate from our low resolution simulation is in agreement with those from Poole et al. (2017) using much high resolution simulations.

The general particle correlator nature of TreeFrog means that it has been used to not only construct halo merger trees but void trees Sutter et al. (2014) and even compare halos across simulations with different subgrid physics (e.g. Elahi et al., 2016; Arthur et al., 2017).

The process of developing TreeFrog has lead to the spin-off of two standalone packages WhereWolf, an halo tracking tool for halo merger trees which fills in gaps in the tree and follows objects deemed to have merged; and OrbWeaver, a tool to reconstruct orbital evolution, that will be presented in a follow-up paper (Poulton in prep). The former corrects gaps and mergers in the tree by tracking particles belonging to the (sub)halo that has a gap in the tree or has merged, to see if the (sub)halo still exists. The latter reconstructs the orbital evolution of halos.

Acknowledgements.

RP is supported by a University of Western Australia Scholarship. RC is supported by the SIRF awarded by the University of Western Australia Scholarships Committee, and the Consejo Nacional de Ciencia y Tecnología (CONACyT) scholarship No. 438594 and the MERAC Foundation. Parts of this research were conducted by the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE170100013. CL is funded by a Discovery Early Career Researcher Award DE150100618. CL also thanks the MERAC Foundation for a Postdoctoral Research Award. The authors contributed to this paper in the following ways: PJE ran simulations and analysed the data, made the plots and wrote the bulk of the paper. PJE is the primary developer of TreeFrog. RP & RT designed and developed various aspects of the code: RP tested the code and motivated the development of the descendant tree; RT developed the compilation infrastructure. RC, CL, CP, & AR assisted in the design of various aspects of the code. All authors have read and commented on the paper.

Facilities

Magnus (Pawsey Supercomputing Centre)

Software

•

VELOCIraptor https://github.com/pelahi/VELOCIraptor-STF

•

TreeFrog https://github.com/pelahi/TreeFrog

•

NBodylib https://github.com/pelahi/NBodylib

•

VELOCIraptor_Python_Tools https://github.com/pelahi/VELOCIraptor_Python_Tools

•

MergerTreeDendograms https://github.com/rhyspoulton/MergerTree-Dendograms

Additional Software

Python, Matplotlib Hunter (2007), Scipy Jones et al. (01), emcee Foreman-Mackey et al. (2013), SciKit Pedregosa et al. (2011), Gadget Springel (2005)

Appendix A Merit Function Comparison

The ideal merit function is one which well separates primaries and secondaries, or more formally, the merit distribution of primary links differs significantly from that of secondary links. We can compare distributions in a variety of ways. One commonly used measure of the similarity of probability distribution functions is the Bhattacharyya distance, $D_{BC}$ , which is related to the Bhattacharyya coefficient, $BC$ Bhattacharyya (1943). For probability distributions $p$ and $q$ defined over a domain $X$ , the coefficient and distances are:

[TABLE]

Distributions with no overlap have $D_{BC}=\infty$ .

Another common distance measure is the $1^{\rm st}$ Wasserstein distance or so-called Earth Mover’s Distance, which measures the minimum work needed to be done to transform $p\rightarrow q$ (e.g. Rubner et al., 1998). Consider a set $P=(p_{1},p_{2},\dots,p_{m})$ and $Q=(q_{1},q_{2},\dots,q_{n})$ with a set of distances $D=[d_{i,j}]$ , where $d_{i,j}$ is the ground distance between $p_{i}$ and $q_{j}$ . The earth mover’s distances $D_{EMD}$ is determined by finding the optimal flow $F=[f_{i,j}]$ that minimises the overall cost of moving set $P$ to $Q$ ,

[TABLE]

subjected to the constraints:

[TABLE]

The earth mover’s distance is defined as the work normalised by the total flow:

[TABLE]

The merit function should produce the maximum distance between primary and secondary distributions. We show the product of the Bhattacharyya distance and the Wasserstein distance in Fig. 15. This figure clearly shows how using ranking can improve the separation between primaries and secondaries and how using only the most bound particles can significantly improve classification.

Appendix B Configuration Options

We list the complete configuration options here.

Appendix C Associated Tools

TreeFrog comes with a Python-2/3 tool-kit, specifically routines to manipulate the output data produced by the various codes. Typically, these produce dict containing numpy arrays, allowing for quick analysis and plotting. The repositories also come with examples of producing metric plots. The codes are Python-3 (compatible with Python-2) and make use of numpy, h5py, scipy, matplotlib, and scikit.learn.

Bibliography49

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Arthur et al. (2017) Arthur J., et al., 2017, MNRAS , 464, 2027 · doi ↗
2Avila et al. (2014) Avila S., et al., 2014, MNRAS , 441, 3488 · doi ↗
3Bagla & Prasad (2006) Bagla J. S., Prasad J., 2006, MNRAS , 370, 993 · doi ↗
4Baugh et al. (2018) Baugh C. M., et al., 2018, preprint, ( ar Xiv:1808.08276 )
5Behroozi et al. (2013) Behroozi P. S., Wechsler R. H., Wu H.-Y., Busha M. T., Klypin A. A., Primack J. R., 2013, Ap J , 763, 18 · doi ↗
6Benson (2017) Benson A. J., 2017, MNRAS , 467, 3454 · doi ↗
7Bhattacharyya (1943) Bhattacharyya A., 1943, Bull. Calcutta Math. Soc., 35, 99
8Boylan-Kolchin et al. (2009) Boylan-Kolchin M., Springel V., White S. D. M., Jenkins A., Lemson G., 2009, MNRAS , 398, 1150 · doi ↗