Consistency relations for the Lagrangian halo bias and their implications
Kwan Chuen Chan, Ravi K. Sheth, Roman Scoccimarro

TL;DR
This paper develops a method using consistency relations among halo bias factors to infer physical properties of protohalos, improving understanding of halo formation and bias modeling.
Contribution
It introduces a framework leveraging consistency relations to connect bias measurements with protohalo physics, including assembly bias effects, with a focus on a two-parameter model.
Findings
Effective two-parameter model accurately describes protohalo-matter cross-correlation.
Consistency relations enable inference of enclosed density and slope from large-scale bias.
Model predictions agree well with direct small-scale measurements.
Kwan Chuen Chan,1 Ravi K. Sheth,2 Román Scoccimarro3
1 Institute of Space Sciences, IEEC-CSIC, Campus UAB, Carrer de Can Magrans, s/n, 08193 Bellaterra, Barcelona, Spain
2 Center for Particle Cosmology, University of Pennsylvania, 209 S. 33rd St., PA 19104, Philadelphia, USA
3 Center for Cosmology and Particle Physics, Department of Physics, New York University, NY 10003, New York, USA
E-mail: [email protected] (KCC)
Abstract
The protohalo patches from which halos form are defined by a number of constraints imposed on the Lagrangian dark matter density field. Each of these constraints contributes to biasing the spatial distribution of the protohalos relative to the matter. We show how measurements of this spatial distribution – linear combinations of protohalo bias factors – can be used to make inferences about the physics of halo formation. Our analysis exploits the fact that halo bias factors satisfy consistency relations which encode this physics, and that these relations are the same even for sub-populations in which assembly bias has played a role. We illustrate our methods using a model in which three parameters matter: a density threshold, the local slope and the curvature of the smoothed density field. The latter two are nearly degenerate; our approach naturally allows one to build an accurate effective two-parameter model for which the consistency relations still apply. This, with an accurate description of the smoothing window, allows one to describe the protohalo-matter cross-correlation very well, both in Fourier and configuration space. We then use our determination of the large scale bias parameters together with the consistency relations, to estimate the enclosed density and mean slope on the Lagrangian radius scale of the protohalos. Direct measurements of these quantities, made on smaller scales than those on which the bias parameters are typically measured, are in good agreement.
Keywords: large-scale structure of Universe
Publication year: 2016
1 Introduction
Halos and the galaxies they host are biased tracers of the dark matter density field (Kaiser, 1984; Bardeen et al., 1986; Mo & White, 1996). To extract cosmological information from galaxy surveys, this bias must be understood. In the best studied models of halo formation, the bias is a consequence of the fact that the protohalo patches from which halos form are defined by a number of constraints imposed on the initial Lagrangian dark matter density field. Each one of these constraints contributes to biasing the spatial distribution of the protohalos relative to the matter. Thus, in principle, there is valuable halo formation physics hidden in the bias parameters. In fact, the bias parameters satisfy a hierarchy of consistency relations (Musso et al., 2012; Paranjape et al., 2013). These consistency relations not only allow us to check the self-consistency of the bias prescription, but they also potentially open the road to learning about the physics of halo formation. The main goal of this paper is to demonstrate that we can indeed extract information about the small scale physics of halo formation from measurements of the large scale clustering of halos. In particular, our methodology allows one to test whether assembly bias effects, of the sort first identified by Sheth & Tormen (2004), are present in the halo population. In this respect, our work complements that of Castorina et al. (2016a, b): whereas they used configuration space methods to address similar issues, we, like Modi et al. (2016), use Fourier space measurements.
This paper is organized as follows. We provide a new derivation of the consistency relations for the linear Lagrangian bias parameters using a straightforward linear algebra method in Sec. 2.1. Since protohalos are extended objects, we discuss the importance of smoothing in Sec. 2.2. In Sec. 2.3 we illustrate our arguments with some specific examples, which we use to address the question of degeneracies between parameters and effective versus exact models of the physics. After showing the measurements of the correlations in configuration and Fourier space in Sec. 3.1, in Sec. 3.2 we discuss our direct estimates of quantities which are thought to matter for the small scale physics of halo formation, some of which are novel. In Sec. 3.3, we use a two-bias parameter model to fit measurements of the Fourier space bias signal. The halo collapse threshold inferred from the consistency relation is compared with the direct measurements of the overdensity within Lagrangian protohalos in Sec. 3.4, and a similar test of the profile slope is in Sec. 3.5. We revisit the physical meaning of the consistency relations in Sec. 3.6. We summarize our findings and conclude in Sec. 4. An Appendix is devoted to the study of the correlation function in real space, and shows that although bias parameters may depend on smoothing window, the combination which matters for the consistency relation does not.
2 Lagrangian constraints, bias and consistency relations
Although ultimately we are interested in halos in Eulerian space, as they are potentially observable, the modelling of halo properties often starts in Lagrangian space. This is primarily because the statistics of the Lagrangian field are easier to describe, particularly if the initial conditions were Gaussian. The best-studied models are the peak (Kaiser, 1984; Bardeen et al., 1986; Desjacques & Sheth, 2010), excursion set (Press & Schechter, 1974; Bond et al., 1991; Musso et al., 2012) and excursion set peak (Appel & Jones, 1990; Paranjape & Sheth, 2012; Paranjape et al., 2013; Biagetti et al., 2014; Dizgah et al., 2016) approaches. In all three approaches, describing how the initial Lagrangian protohalos evolve to form the final Eulerian halos is a separate step. In what follows, we do not consider this second step, except to point out that the way forward is described in Desjacques et al. (2010). For a recent review on halo bias, see Desjacques et al. (2016).
2.1 General formalism
In one of the simplest models, a protohalo is identified with any position where the smoothed field exceeds a (physically motivated) threshold. The peak model adds the additional constraints that the spatial gradient of the smoothed field should vanish and that the curvature should be negative. These constraints on the scale of the protohalo patch impact correlations between the protohalo centers and the large-scale matter distribution. Our goal is to extract these constraints from large-scale cross correlations.
However, to do so, we must first address the fact that ‘smoothing’ is common to all halo formation models. I.e., it is the average properties of the field centered on a patch in the initial conditions which determine whether or not it will become a halo. Therefore, the shape of the smoothing window is expected to play an important role. For this reason, it is important to note that the analysis which follows is generic for all window choices; we will only specify our choice of window in the next subsection.
We suppose that the constraints are given by the vector . If the constraint variables are set to be , then the expectation value of the large-scale field given the constraints is
[TABLE]
where is the distribution of conditioned on . In general, we must integrate over the distribution of the constraint as well:
[TABLE]
We will often assume that is the density field smoothed on a large scale at the same spatial position at which the constraints were specified, but we can take it to be other fields at other positions if we wish. If is taken to be the density, would be the profile of the Lagrangian halos. The large-scale field only serves as a surrogate for extracting the smaller scale halo formation physics. This physics is encoded in the constraint , which we express in terms of normalized (zero-mean unit variance) random variables. For example, it can be , where is the peak height and its curvature. The Fourier transforms of these quantities are defined as
[TABLE]
where is (a Fourier mode of) the dark matter density contrast, and is the smoothing window. The spectral moment is defined as
[TABLE]
where is the linear power spectrum evaluated at . In what follows, we will also be interested in the ‘slope’ variable
[TABLE]
where
[TABLE]
In real space, these variables are
[TABLE]
where we have denoted the smoothed dark matter density field by . These variables illustrate a few of the ways in which the smoothing window appears in the constrained variables. We are now ready to consider generic relations between the constraints.
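As a numerical illustration of the spectral moments that normalize these variables, the sketch below assumes the standard peak-theory convention of Bardeen et al. (1986), σ_j² = ∫ d³k/(2π)³ k^{2j} P(k) W²(kR), a Gaussian window, and a toy power-law spectrum. The function names, the integration grid and the power-law index are illustrative choices, not taken from the paper.

```python
import math
import numpy as np

def gaussian_window(x):
    """Gaussian smoothing window in Fourier space, W_G(x) = exp(-x^2 / 2)."""
    return np.exp(-0.5 * x**2)

def spectral_moment_sq(j, R, power, window=gaussian_window,
                       kmin=1e-5, kmax=1e2, n=20000):
    """sigma_j^2 = (1 / 2 pi^2) * int dk k^2 k^(2j) P(k) W^2(kR).

    Integrated with the trapezoid rule on a logarithmic k grid
    (dk = k dlnk, hence the extra power of k in the integrand).
    """
    lnk = np.linspace(math.log(kmin), math.log(kmax), n)
    k = np.exp(lnk)
    integrand = k**3 * k**(2 * j) * power(k) * window(k * R)**2 / (2.0 * math.pi**2)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(lnk)))

def toy_power(k):
    """Toy spectrum P(k) = k^-2 (assumption: illustration only)."""
    return k**-2.0

R = 2.0
s0, s1, s2 = (spectral_moment_sq(j, R, toy_power) for j in (0, 1, 2))
# Height-curvature correlation coefficient in the Bardeen et al. convention.
gamma = s1 / math.sqrt(s0 * s2)
```

For this toy spectrum and a Gaussian window the moments are analytic, σ_j² = Γ((1+2j)/2) / (4π² R^{1+2j}), so the numerical integral can be checked directly; the resulting height-curvature coefficient is 1/√3.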
The constrained Gaussian field is still Gaussian and its mean and variance are well-known (see e.g. Appendix D in Bardeen et al., 1986). In particular when is constrained to have some specific values , then the conditional mean of is
[TABLE]
where the vector denotes the cross correlation between and the constraint variables, and is the covariance matrix between the constraint variables.
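The conditional mean of a Gaussian variable is a short linear-algebra computation. The sketch below uses assumed names: `cross` for the vector of cross correlations between the large-scale field and the constraint variables, `cov` for the covariance matrix of the constraint variables, and `values` for the values they are constrained to take. The numbers are arbitrary.

```python
import numpy as np

def conditional_mean(cross, cov, values):
    """Mean of a Gaussian variable given constraints: cross^T cov^{-1} values."""
    cross = np.asarray(cross, dtype=float)
    cov = np.asarray(cov, dtype=float)
    values = np.asarray(values, dtype=float)
    # Solve cov @ w = values rather than forming the inverse explicitly.
    return float(cross @ np.linalg.solve(cov, values))

# Two unit-variance constraint variables with correlation 0.6 (arbitrary).
cov = [[1.0, 0.6], [0.6, 1.0]]
cross = [0.3, 0.1]
values = [2.0, 1.0]
mean = conditional_mean(cross, cov, values)
```

Grouping the result by the entries of `cross`, rather than by the entries of `values`, is what leads to the bias coefficients defined next.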
Clearly, the right hand side of Eq. 11 is a sum of many terms. If we multiply first, then the terms will be grouped according to . For bias, we instead wish to group terms by their scale dependence, which means we group by the elements of . We define the linear bias coefficients as
[TABLE]
where is the linear bias coefficient
[TABLE]
(Note that is not summed over in Eq. 13.) In Eq. 12 we have divided by and introduced so that the dimension of the resultant bias parameter agrees with the usual expansion in the density contrast. Note that appears quite differently from other variables, so we shall treat it differently (e.g., when we consider bias at different times). Sometimes it is advantageous to use the variable because it can be completely expressed in terms of the normalized dimensionless variables. We also stress that the defined here have not (yet) been averaged over the constraint, i.e. they explicitly depend on .
Eq. 13 is interesting because it enables us to express the bias parameters directly in terms of the constraints , making the bias problem simply a linear algebra problem. In particular, we can invert Eq. 13 to express in terms of the :
[TABLE]
There are relations between the bias parameters, where is the dimension of , and they simply reflect the underlying constraints on halo formation. Note that the bias in Eq. 13 and the consistency relations in Eq. 14 only depend on , but not . This echoes our previous assertion that the large-scale field is only used to extract the constraint .
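The linear-algebra structure can be checked numerically. The sketch below works with dimensionless weights only (the normalization factors that give the bias parameters their usual dimensions are left out), and the covariance and constraint values are hypothetical. It also illustrates that averaging over a sub-population commutes with the inversion, since both are linear.

```python
import numpy as np

def bias_weights(cov, values):
    """Dimensionless weights b = cov^{-1} C, one per constraint variable."""
    return np.linalg.solve(np.asarray(cov, dtype=float),
                           np.asarray(values, dtype=float))

def constraints_from_bias(cov, weights):
    """Inverse relation: recover the constraint values, C = cov b."""
    return np.asarray(cov, dtype=float) @ np.asarray(weights, dtype=float)

cov = np.array([[1.0, 0.6], [0.6, 1.0]])      # hypothetical covariance
patches = np.array([[2.0, 0.5],               # each row: constraint values
                    [2.0, 1.5],               # of one patch (hypothetical)
                    [2.2, 2.5]])

weights = np.array([bias_weights(cov, c) for c in patches])
mean_weights = weights.mean(axis=0)           # average over the sub-population
recovered_mean = constraints_from_bias(cov, mean_weights)
```

Here `recovered_mean` equals the mean of the constraint values over the same patches, whichever subset of patches is averaged.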
In addition, recall that we have yet to average over the constraint variables. Since averaging is a linear operation, our analysis also shows that partial averages will also satisfy similar consistency relations. As we discuss later, this means that our analysis is immune to what is known as ‘assembly bias’. I.e., although the numerical values of ‘assembly biased’ bias factors may be different from those of the parent population, the algebraic relations between the bias factors – Eq. 14 – will be the same as for the parent population.
The full consistency relations are already given by Eq. 14. However, the first of these consistency relations for is particularly simple. By putting in Eq. 14, we get
[TABLE]
Eq. 15 is interesting because it shows that the value of the constrained variable can be determined if one has measured the bias parameters . This shows explicitly why measurements of the large scale bias can be used to learn about the small-scale physics of halo formation. Notice in particular that if the constrained variables must be averaged over, then this will yield averaged values of whose values will depend on the range over which the average was taken. Inserting these averaged values in Eq. 15 yields the corresponding averaged value of . Since the same procedure works whatever the range over which the average was taken, Eq. 15 shows that the bias factors of an ‘assembly biased’ population can be used to infer what was the physics which led to the assembly bias. While one might have worried that the procedure for recovering the physics of formation might have been different for each sub-population, the analysis above shows that this is not the case.
Eq. 15 has been derived previously, in the context of a specific model in which , following a rather different approach (Musso et al., 2012; Paranjape et al., 2013), where it was called a ‘consistency’ relation. Our analysis shows how to extend the consistency relation to arbitrary and arbitrary constraints. On the other hand, the analysis above is for the first order (sometimes called ‘linear’) bias parameter. There are consistency relations associated with higher order bias parameters (Musso et al., 2012; Paranjape et al., 2013). Based on the results of Musso et al. (2012), we write down the high order consistency relations in Sec. 3.6. Castorina & Sheth (2017) show how to extend these to arbitrary variables and constraints using a generating function approach. So it would be interesting to extend the simple linear algebra method developed here to describe higher order bias. However, doing so is beyond the scope of this work.
2.2 Smoothing windows
The analysis above is valid regardless of the form of . We now turn to the question of what one should use for . We expect it to have a characteristic scale , so its Fourier transform will be a function of the combination . In what follows, we will use the effective window function proposed in Chan et al. (2015), the Fourier transform of which is
[TABLE]
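where $W_{\rm TH}$ and $W_{\rm G}$ denote the top hat and Gaussian window functions respectively:
$$W_{\rm TH}(x) = \frac{3}{x^{3}}\left(\sin x - x\cos x\right), \qquad W_{\rm G}(x) = e^{-x^{2}/2}.$$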
There are a number of reasons why we will work with , rather than or . But before we list them, it is worth noting that although all three filters tend to unity at , they are not all equally compact. A crude measure of the extent of these filters is given by expanding to lowest order in . Then , , and , which suggests that, if we wanted to replace by a tophat or a Gaussian, then we should set and . Thus, the tophat is the most compact of the three, and the Gaussian is the most extended. This difference will be important below.
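The low-k comparison of the three filters can be reproduced numerically. In the sketch below the tophat and Gaussian are the standard forms; the effective window is assumed to be the product of a tophat and a Gaussian whose scale is a fixed fraction `ratio` of the tophat scale. The default `ratio=0.2` is an assumption about the form used in Chan et al. (2015), not something stated in the text above.

```python
import numpy as np

def tophat_window(x):
    """W_TH(x) = 3 (sin x - x cos x) / x^3, with a series fallback near x = 0."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-3
    xs = np.where(small, 1.0, x)              # avoid 0/0 in the exact branch
    exact = 3.0 * (np.sin(xs) - xs * np.cos(xs)) / xs**3
    return np.where(small, 1.0 - x**2 / 10.0, exact)

def gaussian_window(x):
    """W_G(x) = exp(-x^2 / 2)."""
    return np.exp(-0.5 * np.asarray(x, dtype=float)**2)

def effective_window(x, ratio=0.2):
    """Assumed product form W_TH(x) * W_G(ratio * x); ratio is an assumption."""
    return tophat_window(x) * gaussian_window(ratio * np.asarray(x, dtype=float))

def low_k_coefficient(window, x=0.05):
    """Estimate c in W(x) ~ 1 - c x^2 from a single small value of x."""
    return float((1.0 - window(x)) / x**2)

c_th = low_k_coefficient(tophat_window)       # close to 1/10
c_g = low_k_coefficient(gaussian_window)      # close to 1/2
c_eff = low_k_coefficient(effective_window)   # close to 1/10 + ratio^2 / 2

# Scales at which a pure tophat or Gaussian has the same low-k expansion,
# in units of the effective-window scale R.
r_th_equiv = np.sqrt(10.0 * c_eff)
r_g_equiv = np.sqrt(2.0 * c_eff)
```

With the same R the tophat has the smallest coefficient and the Gaussian the largest, which is the sense in which the tophat is the most compact filter and the Gaussian the most extended.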
The first reason we like is purely technical. Although many of the expressions above are formally well-defined for an arbitrary smoothing window, in practice, some lead to divergences. For example, defined in Eq. 5 plays an important role in peak theory. However, for , the integral above diverges if , whereas it is well-behaved for .
The second is more physical. With , Chan et al. (2015) were able to provide an accurate description of protohalos in simulations using just two bias parameters: those associated with and . In contrast, with , just and are not enough to accurately describe protohalos. That is to say, allows us to use the consistency relations above to constrain halo formation. I.e., of the potentially vast number of physical parameters which may matter for halo formation, allows us to work with just . This is a very useful simplification.
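Finally, allows us to illustrate some subtle issues associated with our approach. For example, the normalized slope and curvature variables coincide for a Gaussian smoothing window because
$$\frac{\partial W_{\rm G}(kR)}{\partial R} = -k^{2} R\, W_{\rm G}(kR).$$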
Therefore, for general smoothing windows, such as our Eq. 16, we expect to find that although and are formally different, in practice, constraints on may be rather degenerate with those on .
To illustrate this point, Fig. 1 shows the cross correlation coefficients
[TABLE]
as a function of smoothing scale (recall halo mass ). Notice that , and and are similar in magnitude, as expected (recall that for Gaussian smoothing). In what follows, we will use this fact to address a number of issues which arise when different physical parameters are nearly degenerate.
2.3 Specific examples
It is common to set , although this is not necessary. In the simplest peak model, a proto-halo patch satisfies three constraints: the value of the enclosed overdensity (the height), the first derivative with respect to spatial position (which must be zero to be a local peak), and the second derivative with respect to spatial position (the peak curvature) (Bardeen et al., 1986). In the simplest excursion set peak model, there is an additional constraint on the first derivative with respect to smoothing scale (the excursion set slope) (Appel & Jones, 1990; Musso & Sheth, 2012; Paranjape & Sheth, 2012). More sophisticated models include the shear and the shape (e.g. Sheth et al., 2013; Biagetti et al., 2014; Castorina et al., 2016b); we will ignore these complications.
The first spatial derivative is independent of the other three variables, and it is constrained to be zero anyway. So, in practice, for the peak model and for excursion set peak. It is typical to order these as , where the three variables have zero mean, and they have been normalized by the rms values of the height, curvature and slope, respectively. Therefore, has unity along its diagonal. In addition, since , we will use this example to show how our formulation of consistency relations works when , as well as to address the question of what one should do when different physical parameters are nearly degenerate.
E.g., if and are nearly degenerate, shouldn’t the bias factors for each be similar? If so, how is it that the consistency relation does not double count the contribution from the two terms? Alternatively, what form does the consistency relation take if fundamentally both and matter, but we suppose that only one of them matters? Can we effectively group the and terms together in the consistency relations? Such questions are relevant to the issue of efficiency in describing the halo distribution: How many variables are sufficient to capture the physics? In what follows, we will consider both the 2-parameter and 3-parameter models.
The previous section showed that linear combinations of the bias factors allow one to isolate the dependence of halo formation on , and . To see this in practice, suppose that and . Then
[TABLE]
where would typically be given by the spherical collapse (SC) value
[TABLE]
with
[TABLE]
where is the SC threshold and is the linear growth factor. Eq. 23 is the expression for biased tracers which appears in Kaiser (1984). If the biased tracers were constrained to have rather than , then, because the constraint would include an integral over , the bias factor would also be given by integrating over the allowed range, and would also be replaced by its average over the range. Note that for notational simplicity we have denoted the smoothed dark matter density simply as , and the window function is implicitly included in and the constraint variables.
The next non-trivial case is , with in the peak model, and in the excursion set approach, where is the overdensity as before, its slope with respect to smoothing scale, and its curvature with respect to spatial position. E.g., in the peak model,
[TABLE]
and the bias expansion is given by
[TABLE]
with the bias parameters being
[TABLE]
where , and denote the constrained values. The consistency relations read
[TABLE]
For the excursion set approach, simply replace and .
Notice that , just as when . However, here will be different from when if . Again, allowing a range of values means one must replace and by their averaged values, and should also be replaced by its average.
E.g., a simple model of assembly bias would be to assume that all objects have the same height , but different curvatures or slopes (Dalal et al., 2008; Musso & Sheth, 2014; Lazeyras et al., 2016). Eq. 28 and 29 show that populations with different will have different bias parameters and . However, because we have assumed they have the same height , the sum (the first consistency relation) will be the same for the different populations. On the other hand, because the left hand side of the second consistency relation equals , this one will be different for the different populations. Thus, measuring the large scale bias factors and for the populations and then using the two consistency relations in Eq. 30 allows one to discover that the populations only differ in curvature and not in height. Since this argument obviously holds if the different populations are constrained to have different ranges in and , this example shows explicitly how the same methodology can be used to learn about the physics of assembly bias, without having a priori knowledge that it was even present in the first place. Conversely, if we have some way to directly measure and , e.g. using the direct measurements outlined in Sec. 3, then the consistency relations provide a way to detect assembly bias.
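A toy version of this argument, in dimensionless form, is sketched below. It assumes two normalized constraint variables (height and curvature) with a hypothetical correlation coefficient `gamma`; the weights are the dimensionless combinations obtained by inverting the covariance, without the normalization factors carried by the bias parameters in the text.

```python
import numpy as np

gamma = 0.6                                   # hypothetical height-curvature correlation
cov = np.array([[1.0, gamma], [gamma, 1.0]])

def weights(height, curvature):
    """Dimensionless weights for a population with the given constraint values."""
    return np.linalg.solve(cov, np.array([height, curvature], dtype=float))

# Two populations with the same height but different curvature.
b_a = weights(2.0, 1.0)
b_b = weights(2.0, 2.5)

def first_relation(b):
    """Combination that returns the height."""
    return float(b[0] + gamma * b[1])

def second_relation(b):
    """Combination that returns the curvature."""
    return float(gamma * b[0] + b[1])
```

The individual weights differ between the two populations, yet `first_relation` returns 2.0 for both, while `second_relation` returns 1.0 and 2.5 respectively: the populations are seen to differ in curvature and not in height.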
Finally, we consider with ; recall that is associated with the excursion set upcrossing constraint on the slope of the smoothed density field (Eq. 6). The bias expansion reads
[TABLE]
and Eq. 14 becomes
[TABLE]
In this form it is easy to see that the consistency relation is satisfied. Clearly, our previous remarks about assembly bias apply here too. Our main goal in this example is to illustrate what happens when some of the parameters, in this case and , are nearly degenerate. E.g., for Gaussian smoothing , and so the expressions above clearly depend only on and , and the relations for and are the same.
The bias factors, written explicitly in terms of the correlation coefficients of the constrained variables, are
[TABLE]
where
[TABLE]
and
[TABLE]
The first terms on the rhs of the expressions for and are the same as when (compare Eq. 28 and 29), so it appears to be straightforward to address the question of the impact of the third variable. However, when , both and var vanish, thus and hence and are indeterminate. As we discussed, in this limit, the problem indeed reduces to the case, and we can choose the second variable to be either or . We will return to this shortly.
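The near-degenerate limit can be illustrated with a toy covariance matrix. The sketch below assumes three normalized variables (height, curvature, slope) with hypothetical correlation coefficients, chosen so that curvature and slope are almost perfectly correlated; none of the numbers come from the paper.

```python
import numpy as np

rho = 0.999                                   # curvature-slope correlation, close to 1
cov = np.array([[1.0, 0.6, 0.6],
                [0.6, 1.0, rho],
                [0.6, rho, 1.0]])
values = np.array([2.0, 1.0, 1.05])           # hypothetical constraint values

condition_number = float(np.linalg.cond(cov))
b = np.linalg.solve(cov, values)              # dimensionless weights
b_height, b_curv, b_slope = (float(v) for v in b)
combined = b_curv + b_slope                   # weight of the degenerate pair
recovered = cov @ b                           # consistency relation still holds
```

The covariance is nearly singular, so `b_curv` and `b_slope` are individually large and of opposite sign, while their sum and `b_height` remain of order unity. This is the numerical counterpart of the statement that the two weights become indeterminate in the degenerate limit, and it motivates grouping the two variables into one effective variable.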
We now turn to the question of estimating the bias coefficients from cross-correlations with the dark matter density field . The scale dependence of bias comes from the interplay of three terms: , and , which are defined by
[TABLE]
(We note in passing that in these expressions there is a window function only for the protohalo/peak; there is no window on . Also, for a Gaussian window, so we expect this to remain approximately true for other windows.) Fourier transforming Eq. 2.3, and dividing by the cross correlation between and the dark matter density contrast , yields
[TABLE]
Fig. 2 compares the scale dependence of the three terms on the rhs when is given by Eq. 16. For , only the first term is non-vanishing: hence, can be determined from the bias signal at . The other two terms are rather similar. Therefore, when Eq. 2.3 is used to fit the -dependence of the measured bias signal, the derived constraints on and are likely to be degenerate, along a locus of approximately constant .
Having shown why and are approximately degenerate, we are ready to reconsider the issue of var. It is particularly instructive to consider the following variables
[TABLE]
These variables are uncorrelated with each other, and this simplifies the analysis. By analogy with Eq. 33-37, the bias factors are
[TABLE]
where
[TABLE]
and
[TABLE]
It is again easy to see that the consistency relation is the same as before. However, now var is well-behaved, so it is clear that, when (when , is still indeterminate), then and the system reduces to one involving just two constrained variables, and .
Using Eq. 12, the corresponding effective bias expansion reads
[TABLE]
where
[TABLE]
(In the sign, and are for and respectively.)
The quantities which actually matter for the bias expansion are . Fig. 3 shows that for , and so the term can be neglected. Therefore, when fitting to data at , it should be a good approximation to neglect the variable and use only the effective two-parameter model. This is important, as when fitting data, one does not know a priori how many variables are important for the physics. The fact that means that working with the effective two-variable problem will not yield biased results.
3 Cross correlations and Lagrangian bias consistency relations
As discussed in the previous section, the cross correlation between the dark matter field and the Lagrangian protohalos encodes the constraints used to define the Lagrangian protohalos. We shall check this using Lagrangian protohalos obtained in numerical simulations. To do so, we identify Eulerian halos at some redshift , and trace the constituent particles in each halo back to the initial redshift . The position of the Lagrangian protohalo is estimated using the center of mass of the particles in the Lagrangian space.
In this work, we use two sets of simulations from the LasDamas project, denoted by Oriana and Carmen respectively. Both sets assume a flat ΛCDM model with the cosmological parameters, , and . The initial conditions are Gaussian with spectral index , and a transfer function output from CMBFAST (Seljak & Zaldarriaga, 1996). The initial displacement fields are set using 2LPT (Crocce et al., 2006) at . The simulations are evolved using the public code Gadget2 (Springel, 2005). In the Oriana simulations, there are particles in a cubic box of size 2400 . For the Carmen simulations, there are particles in a box of size 1000 . Thus the particle mass in Oriana and Carmen is and respectively. We use 5 realizations from Oriana and 7 from Carmen. In each, the Eulerian halos are identified using a friends-of-friends algorithm (Davis et al., 1985) with linking length at and 0. We only consider halos with at least 65 particles. We bin the halos into narrow mass bins of width . To avoid contamination from evolution, by Lagrangian space we mean the initial Gaussian density field evaluated at the grid positions instead of the displaced 2LPT field.
3.1 Lagrangian cross-correlations in configuration and Fourier space
We measure cross-correlations between the initial Lagrangian protohalo positions and the dark matter at the initial time using configuration space and Fourier methods.
The configuration space signal, is simply the mean density profile around the Lagrangian halo centre (e.g. Peebles, 1980). Fig. 4 shows this cross-correlation for a range of narrow mass bins; much of the dependence on mass is removed if distances are scaled by , which is defined as
[TABLE]
( is the comoving density of the dark matter). However, there is some residual mass dependence: drops slightly more rapidly for lower mass halos. The plot shows that, within , is larger for less massive halos, but the mass trend reverses beyond : i.e., less massive protohalos have steeper profiles. This is in qualitative agreement with a generic prediction of peak theory (Sheth, 1999). In the present context, this confirmation of the peak prediction is interesting, but it is only a means to an end.
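For concreteness, the rescaling of distances can be sketched as follows, assuming the usual definition of the Lagrangian radius, M = (4π/3) ρ̄ R³, with ρ̄ the comoving matter density. The function names and the value of the matter density parameter in the usage lines are illustrative and are not taken from the simulations described above.

```python
import math

# Critical density today in units of h^2 M_sun / Mpc^3.
RHO_CRIT = 2.775e11

def mean_matter_density(omega_m):
    """Comoving mean matter density in h^2 M_sun / Mpc^3."""
    return omega_m * RHO_CRIT

def lagrangian_radius(mass, omega_m):
    """Radius (h^-1 Mpc) of the sphere that contains `mass` (h^-1 M_sun)
    at the comoving mean matter density."""
    return (3.0 * mass / (4.0 * math.pi * mean_matter_density(omega_m)))**(1.0 / 3.0)

omega_m_example = 0.3                         # illustrative value only
radius = lagrangian_radius(1e13, omega_m_example)
# Round trip: the mass enclosed within `radius` at the mean density.
mass_back = 4.0 * math.pi / 3.0 * mean_matter_density(omega_m_example) * radius**3
```

Because the radius scales as the cube root of the mass, a factor of eight in mass corresponds to a factor of two in the scaling radius.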
The cross power spectrum between the initial Lagrangian protohalo overdensity field , and that of the dark matter at the initial time, , is defined as
[TABLE]
where is the Dirac delta function. Note that is the Fourier transform of , and that here should not be confused with , the overdensity within a protohalo patch, which played a major role in the previous section.
Since the Lagrangian matter power spectrum is
[TABLE]
we define the Lagrangian cross bias parameter
[TABLE]
where is the linear growth factor. Note that we extrapolate the Lagrangian bias parameter to using linear evolution of bias. The presence of unity in Eq. 53 is due to the finite initial redshift in simulation (see, e.g., Chan et al., 2012). The argument denotes the redshift at which the Eulerian halos were identified.
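A minimal sketch of a cross-bias estimator on a periodic grid is given below. It computes the ratio of the binned cross power to the binned matter auto power, and it leaves out the growth-factor extrapolation and the additive unity discussed above. The grid size, box size, binning and the synthetic test field are all assumptions made for illustration.

```python
import numpy as np

def cross_bias(delta_h, delta_m, box_size, n_bins=8):
    """Binned ratio of cross power to matter auto power for fields on a cubic grid.

    Returns the bin centres in k and the ratio
    sum Re[delta_h(k) conj(delta_m(k))] / sum |delta_m(k)|^2 in each bin.
    """
    n = delta_m.shape[0]
    hk = np.fft.rfftn(delta_h)
    mk = np.fft.rfftn(delta_m)
    kf = 2.0 * np.pi / box_size               # fundamental frequency
    kx = np.fft.fftfreq(n, d=1.0 / n) * kf
    kz = np.fft.rfftfreq(n, d=1.0 / n) * kf
    kmag = np.sqrt(kx[:, None, None]**2 + kx[None, :, None]**2 + kz[None, None, :]**2)

    cross = (hk * np.conj(mk)).real.ravel()
    auto = (np.abs(mk)**2).ravel()

    # Linear bins from the fundamental to the 1D Nyquist frequency.
    edges = np.linspace(kf, kf * n / 2, n_bins + 1)
    idx = np.digitize(kmag.ravel(), edges) - 1
    valid = (idx >= 0) & (idx < n_bins)       # drops k = 0 and modes beyond Nyquist
    num = np.bincount(idx[valid], weights=cross[valid], minlength=n_bins)
    den = np.bincount(idx[valid], weights=auto[valid], minlength=n_bins)
    centres = 0.5 * (edges[1:] + edges[:-1])
    return centres, num / den

# Synthetic check: a tracer field that is exactly 1.5 times the matter field.
rng = np.random.default_rng(0)
matter = rng.standard_normal((16, 16, 16))
tracer = 1.5 * matter
k_centres, bias_k = cross_bias(tracer, matter, box_size=100.0)
```

In an actual measurement the two fields would be the gridded protohalo and initial matter overdensities; the synthetic fields here only verify that the estimator returns the input ratio in every bin.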
3.2 Direct estimates of the mean enclosed overdensity and slope
The consistency relations for the enclosed overdensity and its slope, both evaluated on the scale , state that
[TABLE]
(compare Eq. 30). We test these relations by measuring the quantities on the left- and right-hand sides.
The enclosed overdensity is obtained by smoothing the density profile, so it is just
[TABLE]
Here is the inverse Fourier transform of the smoothing window defined earlier, so it has units of inverse volume. E.g., for a tophat, , and for , see Chan et al. (2015). I.e., is the cross correlation shown in Fig. 4, smoothed by the effective window function.
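The smoothing integral is simple to evaluate for a tabulated profile. The sketch below uses a tophat window for simplicity, for which the enclosed overdensity is 3/R³ times the integral of r² ξ(r) from 0 to R; the effective window of Eq. 16 would replace the sharp cut by its own real-space kernel. The profile used in the usage lines is a toy function, not a measurement.

```python
import numpy as np

def enclosed_overdensity_tophat(r, xi, R, n=2001):
    """Mean of the profile xi(r) within a sphere of radius R.

    r must be increasing and should extend from (near) zero to at least R.
    The profile is linearly interpolated onto a uniform grid on [0, R] and
    integrated with the trapezoid rule.
    """
    rr = np.linspace(0.0, R, n)
    xx = np.interp(rr, r, xi)
    integrand = xx * rr**2
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(rr))
    return float(3.0 * integral / R**3)

# Toy profile xi(r) = 1 - r^2, for which the enclosed value at R = 1 is 0.4.
r = np.linspace(0.0, 2.0, 4001)
xi = 1.0 - r**2
enclosed = enclosed_overdensity_tophat(r, xi, R=1.0)
```

Applying the same function over a range of R gives the enclosed overdensity as a function of smoothing scale, from which a slope with respect to scale can be estimated by finite differences.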
Similarly, the slope variable is
[TABLE]
where is the overdensity within a protohalo patch when smoothed on scale (it is not defined in Eq. 51!). For reasons which will become clear shortly, Fig. 5 shows a related measure of the slope: . As for , normalizing distances by removes most of the mass dependence. For all masses, we find that the slope has an obvious maximum on scales that are very close to , and this maximum is larger for the less massive halos. Also note that there is an additional small dependence on . The maximum feature in Fig. 5 arises because there is an intrinsic window function in , and changes rapidly when the external window function has large overlap with the intrinsic one. This happens when , and hence a peak results.
The quantities which are most directly related to the consistency relations are the mean enclosed overdensity and slope on scale . While the previous expressions show how to estimate them from , they can also be written as sums over Fourier space quantities:
[TABLE]
and
[TABLE]
Figs. 6 and 7 show that our direct estimates of and using are in excellent agreement with the more traditional estimators which are based on . Both estimators in Fig. 6 show that the enclosed overdensity is larger for smaller masses (strictly speaking, for smaller ). This is in agreement with previous direct measurements of the mean overdensity within protohalo patches (Sheth et al., 2001; Robertson et al., 2009; Elia et al., 2012; Despali et al., 2013).
With these direct estimates of the overdensity and slope in hand, we are now ready to estimate the other quantities which appear in the consistency relations: large-scale bias factors.
3.3 Lagrangian cross bias parameters in Fourier space
The symbols in Fig. 8 show the Lagrangian protohalo bias parameters for halos identified at and 0 for a range of halo masses (as indicated) obtained using Eq. 53 from the measurements. Some of the mass dependence is removed because we plotted against rather than . This plot is similar to Fig. 5 in Chan et al. (2015). There we used it to highlight the fact that Eq. 16 is crucial for estimating the Lagrangian bias parameters reliably, but we did not go into the details of the biasing model. Here we instead focus on the best fit bias parameters, and explore the consistency relations among them.
Our goal here is to model the measurements of shown in Fig. 8. We start with the simple two-parameter model for the Lagrangian cross bias :
[TABLE]
As we mentioned, the two bias parameters and arise from the density threshold and slope constraints (Musso et al., 2012, except that our is what they call ). We have changed notation from in Sec. 2 to , etc., to highlight the fact that the biases in Sec. 2 are un-averaged, while those measured in the simulations are averaged over the constraints as in Eq. 2.
The smooth black curves in Fig. 8 show the result of treating , and as free parameters when fitting Eq. 59 to the measurements. In practice, to ensure that the low- part, which has large error bars relative to the high -region, is properly fitted, we first determine by fitting to the low constant part up to . We can do this because and the scale-dependent term vanishes at (cf. Fig. 2). We then keep the best-fit fixed and fit the remaining parameters and . Evidently, this simple model for is able to provide a good fit over the entire range of .
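The two-stage fit described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitting code used for Fig. 8: the Gaussian `window` and its logarithmic derivative `window_dlnR` are stand-ins for the window of Eq. 16 and the scale-dependent term of Eq. 59 (neither is reproduced here), the three free parameters are assumed to be a constant amplitude, a scale-dependent amplitude and a smoothing scale, and all function names, the synthetic data and the grid of smoothing scales are hypothetical.

```python
import math

def window(x):
    # Stand-in Gaussian window (assumption); tends to 1 as x -> 0.
    return math.exp(-0.5 * x * x)

def window_dlnR(x):
    # d/dlnR of exp(-(kR)^2/2) (assumption); vanishes as x -> 0.
    return -x * x * math.exp(-0.5 * x * x)

def fit_constant(ks, bs, errs, k_low):
    """Stage 1: inverse-variance weighted mean of b(k) over k < k_low."""
    num = den = 0.0
    for k, b, e in zip(ks, bs, errs):
        if k < k_low:
            w = 1.0 / (e * e)
            num += w * b
            den += w
    return num / den

def fit_scale_dependent(ks, bs, errs, b_const, R_grid):
    """Stage 2: hold b_const fixed, scan R, solve for the amplitude in closed form."""
    best = None
    for R in R_grid:
        sxy = sxx = 0.0
        for k, b, e in zip(ks, bs, errs):
            w = 1.0 / (e * e)
            d = window_dlnR(k * R)
            sxy += w * d * (b - b_const * window(k * R))
            sxx += w * d * d
        amp = sxy / sxx
        chi2 = sum(
            ((b - b_const * window(k * R) - amp * window_dlnR(k * R)) / e) ** 2
            for k, b, e in zip(ks, bs, errs)
        )
        if best is None or chi2 < best[0]:
            best = (chi2, amp, R)
    return best[1], best[2]

# Noiseless synthetic "measurements" with known (hypothetical) parameters.
TRUE_CONST, TRUE_SCALE, TRUE_R = 1.5, 0.8, 3.0
ks = [0.001 * 1.2 ** i for i in range(35)]
bs = [TRUE_CONST * window(k * TRUE_R) + TRUE_SCALE * window_dlnR(k * TRUE_R) for k in ks]
errs = [0.05] * len(ks)

b_const = fit_constant(ks, bs, errs, k_low=0.005)
R_grid = [2.0 + 0.1 * i for i in range(21)]
b_scale, R_fit = fit_scale_dependent(ks, bs, errs, b_const, R_grid)
```

Fixing the constant term first keeps the noisy low-k points from being outweighed by the many well-measured high-k points; the remaining fit is linear in the amplitude at each trial smoothing scale, so only the scale needs to be scanned.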
Fig. 9 shows the best-fit parameters, , , and as a function of halo mass, expressed in terms of (where large has large ). Note that uses Eq. 16 for the smoothing window, and the dependence of allows us to easily compare measurements for halos identified at different redshifts (in this case and 0). The bias parameters from different coincide with each other rather well, in agreement with previous work (Sheth & Tormen, 1999). We find that, although the best fit is close to , the best-fit decreases as increases. For low , especially , the model does not work very well. There are additional small differences between the results from Carmen and Oriana simulations.
In principle, we should be able to use our Fourier space estimates of (essentially the solid curves in Fig. 8) to predict the measured of Fig. 4. Since this is not the main focus of our study, we only show the result in Fig. 14.
In addition, it happens that our measurements of and are reasonably well described by the excursion set peak formulae of Paranjape et al. (2013) provided one assumes halos formed from an ellipsoidal collapse with some stochasticity. In that model, a stochastic term proportional to is added to the spherical collapse barrier so that the barrier increases as decreases. The amount of stochasticity is controlled by the free parameter . The cyan band shows the range spanned by their models with . The yellow curve shows (spherical collapse and no stochasticity). While the general agreement is encouraging, constraining specific collapse models is not the main focus of our study (for more discussion of models, see Castorina et al., 2016b). Rather, our goal is to test the accuracy of the consistency relation, Eq. 54.
3.4 Consistency relation for the smoothed enclosed overdensity
The first consistency relation of Eq. 54 equates the sum of the large scale bias factors to , where is the mean overdensity within the protohalo patch. The red and green symbols in the upper panel of Fig. 10 show our direct estimates of using Eq. 55 and Eq. 57 with (i.e., they are the lower set of symbols in Fig. 6). The blue symbols and curves show our estimate of from the large scale bias factors: . The lower panel shows that the consistency relation estimate of is within 4% of the direct estimate (we used the one shown by the green symbols). This agreement is remarkable, given that the two estimates have very different systematics: that based on and is derived from two-point measurements on relatively large scales, whereas the traditional view of the direct measure is that it is more like a one-point measurement on substantially smaller scales. We will have more to say about this agreement in Section 3.6. For now, we simply note that the consistency relation has used just large scale bias factors to correctly predict the enclosed Lagrangian overdensity of the protohalos measured on much smaller scales.
It is possible that the slight discrepancy which is apparent at lower is indicating that our model, which assumes there are only two important parameters, is overly simplistic. We argued that, in principle, our analysis permits one to add as many parameters as desired to the bias prescription, so that the effective bias agrees with the simulation results well. That our simple model returns an estimate which is within a few percent of the direct measurement is a non-trivial self-consistency check that the simplest requirements (recall we have just two bias parameters) already capture most of the effects of bias. The agreement shown in Fig. 10 means that consistency relations have opened the door to using the scale dependence of bias to constrain halo formation physics. Encouraged by this result, we now study the consistency relation associated with the second variable: the slope.
3.5 Consistency relation for the slope
We will now study the consistency relation for the slope rather than the enclosed overdensity . A little algebra shows that the second of Eqs. 54 can be re-written as
[TABLE]
where is the slope variable normalized by its rms value, defined in Eq. 6, and is the enclosed overdensity defined in Eq. 55 evaluated on scale (also see Eq. 57), so is the original unnormalized slope (Eq. 58).
The first consistency relation states that times the sum of the bias factors is an estimator of . For similar reasons, it is interesting to view the second consistency relation as equating to on scale . We take this latter quantity from the direct measurements shown in Fig. 5; these are shown as the red curves in the upper panel of Fig. 11. The green symbols and curves show the Fourier-based estimator of Eq. 58. These clearly show steeper slopes for lower mass protohalos, which we asserted earlier was a generic prediction of peak theory. The blue curves show . The large scale bias factors have correctly estimated the trend for lower mass protohalos to have steeper slopes on the scale . The ratio between the consistency relation estimate and the direct measurement using is plotted in the lower panel of Fig. 11. The agreement between these estimates is roughly as good as it was for the enclosed overdensity (compare Fig. 10).
Figs. 10 and 11 are the main results of this paper. They show that the scale dependence of large-scale bias provides reliable estimates of both the density and its slope at the (smaller) Lagrangian radius – quantities which are expected to encode the small scale physics of halo formation. Note in particular that these estimates do not require a priori knowledge of the physics of collapse (spherical or not, deterministic or not), or the nature of the halo population (assembly biased or not).
3.6 Consistency relations revisited
Eqs. 57 and 58 provide a simple way to see why the consistency relations work. Start with and assume that Eq. 59 for is not just accurate, it is exact. Then
[TABLE]
which is the first of the consistency relations in Eq. 54. It is a simple matter to verify that if of Eq. 2.3 then the expression above for becomes the first of the consistency relations in Eq. 32. This shows that, if our model for is correct, then the consistency relation is a tautology.
A similar analysis of the slope, i.e., inserting Eq. 59 in Eq. 58, yields
[TABLE]
which is the second relation in Eq. 54. And if , then the expression above for becomes the third of the consistency relations in Eq. 32.
Now suppose we have only reliable knowledge about the window function for . Then it is useful to define , which is the same as the true window function up to , beyond which it is set to zero. Then although is still given by the true window function , the enclosed overdensity estimated using is
[TABLE]
where is computed using and . Similarly the consistency relation for the slope becomes
[TABLE]
where is also computed using . This shows that both consistency relations look like the original ones, with rescaled and , and values.
To see what this means, consider what happens when we estimate from the measured by fitting to . Suppose that the true scale dependent bias piece is . If we are unsure of the high- part of , then we cannot compute , and hence when fitting to determine . This means that when we fit, we are forced to fit for some using . Getting a good fit means and we are assuming for , the result of fitting well will be that . Blindly inserting this in the consistency relation yields Eq. 63.
Thus, provided we have fit well, and provided we use everywhere, the consistency relation will be satisfied. However, will not be the same as obtained with the true smoothing window. In addition, as we change , will change. As a result, will depend on the value of , until is close enough to 1 that the difference doesn’t matter. I.e., the difference between the true and does not matter for larger than this final . If , then we will truly have used the large scale information (on scale ) to constrain the density on scale . Note that the requirement is not really . Rather all we really need is to be close enough to . For , it may be that one can live with fairly small and still be OK. The requirement for the slope is slightly more stringent, since then we also need to have converged to .
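The dependence on the truncation scale can be illustrated with a variance-like integral. The sketch below is an assumption-laden toy, not the calculation in the text: it uses a Gaussian window and a power-law spectrum with a hypothetical index `n = -1.5` and unit amplitude, and it evaluates the standard smoothed variance, the integral of k^2 P(k) W^2(kR) / (2 pi^2), up to a cutoff `k_max`. The names `truncated_variance` and `full_variance` are illustrative.

```python
import math

def gaussian_window(x):
    # Stand-in smoothing window (assumption).
    return math.exp(-0.5 * x * x)

def truncated_variance(k_max, R=1.0, n=-1.5, steps=20000):
    """Midpoint-rule estimate of (1/2pi^2) * int_0^k_max dk k^(2+n) W^2(kR)."""
    dk = k_max / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * dk
        total += k ** (2.0 + n) * gaussian_window(k * R) ** 2
    return total * dk / (2.0 * math.pi ** 2)

def full_variance(R=1.0, n=-1.5):
    """Closed form for k_max -> infinity: Gamma((n+3)/2) / (2 R^(n+3)) / (2pi^2)."""
    return math.gamma(0.5 * (n + 3.0)) / (2.0 * R ** (n + 3.0)) / (2.0 * math.pi ** 2)

# Ratio of truncated to full variance for several cutoffs (in units of 1/R).
ratios = {k_max: truncated_variance(k_max) / full_variance()
          for k_max in (0.5, 1.0, 2.0, 5.0)}
```

In this toy the ratio rises monotonically towards unity once the cutoff exceeds a few times 1/R, which is the sense in which quantities computed with the truncated window stop depending on the cutoff.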
The analysis above makes the point that the accuracy with which the consistency relation estimates the smoothed overdensity is directly related to how well our model for actually fits the true ; this depends in part on how well the assumed form for approximates the true . Although one might have thought that many parameters were required to achieve accurate results, Figs. 8–11 indicate that the simple two-parameter model, motivated by the excursion set approach, is good enough for reconstructing the enclosed overdensity and its slope.
It turns out to be straightforward to extend this analysis to higher order bias. Start with Eq. B2 in Musso et al. (2012), which states that, in a Gaussian random field,
[TABLE]
where is the matter overdensity at the position of the th halo, smoothed on scale with window , and is the probabilist’s Hermite polynomial, and the are th order bias coefficients. (Strictly speaking, they wrote the sum over halos which appears in the first equality above as . Since means something else here, we have simply written the sum explicitly. In addition, our expression corrects a few other typos in theirs.) Our notation highlights the fact that is a cross correlation between protohalo positions and Hermite-polynomial weightings of the smoothed density fluctuation field. In a model where only and matter,
[TABLE]
When then it is easy to see that the result of inserting Eq. 67 in Eq. 66 is
[TABLE]
this is the consistency relation statement that a binomial coefficient weighting of the scale-dependent bias factors yields the averaged Hermite-polynomial of the smoothed-overdensity centered on the protohalos. If we replace with , then the factors become but our final expression for is unchanged. In this case, is just of the previous subsections. It is easy to check that the relation for the slope also works out easily. And finally, if parameters matter, then the binomial weighting becomes a multinomial (Castorina & Sheth, 2017).
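For readers who want to evaluate the Hermite-polynomial averages numerically, the following sketch computes the probabilist's Hermite polynomials from the standard recurrence He_{n+1}(x) = x He_n(x) - n He_{n-1}(x). The helper `mean_hermite`, which averages He_n over a list of smoothed overdensities divided by an rms value, is a hypothetical illustration; the normalization by the rms is an assumption, and the input values are made up.

```python
def hermite_prob(n, x):
    """Probabilist's Hermite polynomial He_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, x  # He_0 and He_1
    for m in range(1, n):
        # He_{m+1}(x) = x * He_m(x) - m * He_{m-1}(x)
        prev, curr = curr, x * curr - m * prev
    return curr

def mean_hermite(deltas, sigma, n):
    """Average of He_n(delta / sigma) over a set of protohalo overdensities (hypothetical helper)."""
    return sum(hermite_prob(n, d / sigma) for d in deltas) / len(deltas)

# Made-up smoothed overdensities at protohalo positions, with an assumed rms of 1.
example_deltas = [1.6, 1.7, 1.8]
first_moment = mean_hermite(example_deltas, 1.0, 1)
```

The recurrence reproduces He_2(x) = x^2 - 1 and He_3(x) = x^3 - 3x, and for n = 1 the average reduces to the mean normalized overdensity.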
4 Conclusions
In the initial Lagrangian space, protohalo patches which are destined to evolve into halos at a later time must satisfy a number of constraints. Typically, these constraints involve the value of the smoothed density field as well as its derivatives with respect to position and scale. Each of these constraints leaves its mark on the spatial distribution of the protohalos: associated with each constraint is a hierarchy of Lagrangian bias parameters (linear, second order, etc.). We used a simple matrix algebra analysis to show that if there are constraints, there will be linear bias parameters which satisfy self-consistency relations (Section 2.1). To date, only one of these relations has been highlighted. Section 3.6 provided a rather different Fourier-space analysis of these relations. Both approaches show that these relations encode information about the constraints: We demonstrated this using measurements of the properties and clustering of protohalo patches in the initial conditions of numerical simulations.
Our consistency relations show that measurements of the scale-dependence of bias can be combined to estimate the critical overdensity required for halo formation. This estimate shows that the critical overdensity, smoothed over the protohalo patch scale, should be larger for less massive halos, and is in excellent agreement with direct measurements of the overdensity within the protohalo patches (Fig. 10). We used two estimators for the direct measurement: one is closer to the traditional estimator, and is based on averaging in configuration space (Eq. 55). The other is a new estimator built from Fourier space measurements (Eq. 57). The two are in excellent agreement (Fig. 6). Our Fourier-based estimator provides a particularly transparent way to see why the consistency relations work so well (Section 3.6).
The scale-dependence of bias can also be used to estimate the slope of the overdensity around protohalos, evaluated on the scale of the patch. This predicted slope is steeper around less massive protohalos, and is again in excellent agreement with direct measurements of the slope (Fig. 11). As for the mean enclosed density, our direct measurements of the mean slope were also based on configuration- (Eq. 56) and Fourier-based (Eq. 58) methods, which agreed well (Fig. 7). And the Fourier-based estimator again provides a simple route to the consistency relations (Section 3.6).
Our formalism suggests that, although the values of the large scale bias factors may depend on the shape of the smoothing window, the combination which matters for the consistency relation does not. In this respect, detailed a priori knowledge of the shape and scale of the smoothing is not necessary for our methodology to work. Explicit measurements of this dependence confirm this expectation (Figs. 17 and 18).
Thus, our analysis has shown that the large-scale bias parameters, which are usually regarded as nuisance parameters in cosmological analyses, actually encode useful information about the small-scale physics of halo formation. The consistency relations we highlighted allow one to decode this information. Although we illustrated our approach using a simple model in which only two (sensibly chosen!) parameters matter, we showed that our analysis carries over, essentially unchanged, when there are more bias parameters. This is important because, as the required precision on our understanding of halo formation increases, we expect to encounter more and more constraints, and hence more bias parameters. Our analysis showed how to proceed when some of these parameters are nearly degenerate with others. Moreover, even in the two parameter case, our analysis is immune to assembly bias, in the sense that the same consistency relations can be used for any subset of the halo population, without any a priori information about the nature of the subset. Indeed, instead of breaking down in the presence of assembly bias, the consistency relations allow us to learn about the physics of assembly bias.
Our analysis leads to three interesting directions for further study. The first is to extend our linear algebra method, which is particularly simple for understanding the relations between linear bias parameters, to determine the consistency relations associated with higher order bias terms. (Our Fourier space approach was generalized to higher order bias in Section 3.6.) The second is to study how the consistency relations for the Lagrangian bias of protohalo patches which we have studied here are modified when one considers the Eulerian bias of evolved halos. Finally, our results clearly have implications for studies which exploit the relation between the scale independent linear bias parameter and halo mass, and which seek to relate the amplitude of to higher order bias parameters (, say, see e.g. Hoffmann et al. (2016)). Clarifying these issues is the subject of work in progress.
Acknowledgements
We thank the participants of the workshop “Statistics of Extrema in Large Scale-Structure” at the Lorentz center for helpful feedback in March 2016, and V. Desjacques for his hospitality in Geneva and the ICTP for its hospitality in Trieste during the summers of 2015 and 2016 where some of this work was completed. We thank the LasDamas project (http://lss.phy.vanderbilt.edu/lasdamas) for the simulation outputs used in this work. The simulations were run using a Teragrid allocation; some RPI and NYU computing resources were also used. KCC acknowledges the support from the Swiss National Science Foundation and the Spanish Ministerio de Economia y Competitividad grant ESP2013-48274-C3-1-P. Finally, we thank the referee for a helpful report.
Appendix A Cross correlations in configuration space
The main text uses bias parameters estimated from protohalo-mass cross correlations in Fourier space. An interesting by-product of our analysis is an understanding of the same cross correlations in configuration space. We first use a simple analytically tractable case to illustrate how the choice of smoothing filter is expected to impact our analysis. We then show configuration space measurements from our simulations, and illustrate how well they are reproduced by our Fourier space analysis. This comparison has no free parameters, so any disagreement is a consequence of systematics. Before illustrating how the choice of smoothing window affects our use of consistency relations in practice, we use our cross-correlation measurements to illustrate a point which was recently made by Castorina et al. (2016a): that suitably normalized cross-correlations with the slope of the large scale fluctuation field should yield the same large scale bias as cross-correlations with the density itself. Finally, we show that our use of consistency relations does not require detailed knowledge of the shape of the smoothing window.
A.1 Dependence on smoothing window: Theory
Suppose that the power spectrum is a power law
[TABLE]
See Musso & Sheth (2014) for more discussion of why this is an interesting choice in the cosmological context. We set by matching the CDM at . The dark matter correlation function is
[TABLE]
In the upcrossing approximation of the excursion set approach, the protohalo-mass cross power spectrum is
[TABLE]
where is the smoothing scale of the window associated with defining protohalo patches, and is given by Eq. 5.
For the top-hat window (), the second term can be simplified by noting that
[TABLE]
because for Eq. 69. Setting in Eq. 71 and inverse Fourier transforming yields
[TABLE]
If we define , then
[TABLE]
Remarkably, for , the ratio is constant and equal to . I.e., the configuration space bias is independent of scale for all ; it becomes scale dependent only when .
In contrast, for Gaussian smoothing,
[TABLE]
where is the smoothing scale for the Gaussian window. We set to match and to lowest order in . Since
[TABLE]
we have
[TABLE]
where . Clearly, for this filter, is scale dependent on all scales. This is also true for (Eq. 16) which we used extensively in the main text (and which we treat numerically).
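As a numerical aside on matching the two filters, the sketch below compares the Fourier-space top-hat window, 3(sin x - x cos x)/x^3, with a Gaussian window exp(-x^2/2). Matching their small-k expansions at order k^2 (the top-hat expands as 1 - x^2/10) gives a Gaussian scale equal to the top-hat radius divided by sqrt(5). That this is the matching intended above is an assumption; the function names and the test wavenumbers are illustrative.

```python
import math

def tophat_window(x):
    """Fourier transform of a real-space top-hat: 3 (sin x - x cos x) / x^3."""
    if abs(x) < 1e-3:
        # Series expansion avoids catastrophic cancellation near x = 0.
        return 1.0 - x * x / 10.0 + x ** 4 / 280.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3

def gaussian_window(x):
    """Gaussian window exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)

def matched_gaussian_scale(R_tophat):
    """Gaussian scale whose window agrees with the top-hat at order k^2 (assumed matching)."""
    return R_tophat / math.sqrt(5.0)

R_TH = 1.0
R_G = matched_gaussian_scale(R_TH)

# Difference between the two windows at a small and a large wavenumber.
small_k_diff = abs(tophat_window(0.1 * R_TH) - gaussian_window(0.1 * R_G))
large_k_diff = abs(tophat_window(4.0 * R_TH) - gaussian_window(4.0 * R_G))
```

The two windows agree closely at small k but differ at kR of a few, where the top-hat oscillates and the Gaussian decays smoothly; this is one way to see why the scale dependence of the bias differs between the filters.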
Fig. 12 compares the cross bias parameter
[TABLE]
for , and , when , , and , which correspond to halos of mass at . In all cases, the bias approaches on large scales (). However, the contribution to from the -part drops rapidly as , while that from peaks sharply around . Therefore the bump in is mainly driven by the term which is proportional to . The sharpness of this bump depends on the smoothing window: it is least sharp for the Gaussian filter because is substantially less compact than .
The other quantity which played an important role in the main text is the slope of the profile, , where is given by Eq. 55. For the tophat window
[TABLE]
where , while for the Gaussian window
[TABLE]
with . Fig. 13 shows for the three smoothing windows, using the same parameters as for Fig. 12. All three curves show a peak near ( for Gaussian). However, is narrower and cuspier than . The curve for , with a narrow but rounded peak, is otherwise rather similar to that for . More importantly, it is qualitatively similar to those shown in Fig. 5 of the main text.
A.2 Lagrangian cross bias in configuration space
The main text made extensive use of the bias parameters estimated from fitting to Fourier space measurements. In principle, we could have fitted to configuration-space measurements instead. Rather than doing so here, we instead show that the Fourier-space analysis is able to provide a good description of our configuration-space measurements.
Fig. 4 shows the configuration space cross-correlations for the same halo populations shown in Fig. 8. From these, we defined the configuration space cross-bias parameter
[TABLE]
where is the linear theory correlation function of dark matter at and is the redshift at which the Eulerian halos that we used to define the Lagrangian protohalos were identified. Comparison with Eq. 53 shows that is defined similarly to . Note that is computed, not from the particle distribution, but by performing the integral in Eq. 70 over the linear theory power spectrum .
The symbols in Fig. 14 show our measurements of . They are well described by the smooth curves which are not fits; rather, they show the predicted shape which we obtained by setting
[TABLE]
where is given by Eq. 59, with , and taken from Fig. 9 (i.e., from fitting to the cross-spectrum bias shown in Fig. 8), and then inserting this, instead of the measured in Eq. 81. Except for the most highly biased objects, these curves agree with the measurements reasonably well. This demonstrates that our analysis does not suffer from large systematic effects associated with transforming between configuration and Fourier space.
To examine the predictions in more detail, Fig. 15 shows the - and -contributions (dashed and dotted) for two mass bins (as labelled) at . For , the apparently constant part in fact has contributions of opposite signs, which cancel to yield the large-scale constant value. (Fig. 14 suggests that high mass halos may suffer larger systematics at large than the low mass ones as deviations from the measurements are more apparent. However, it is the fractional deviation which matters, and these are comparable to or smaller than for the low mass halos.) For , both components are positive. As for the simple example shown in Fig. 12, the bump is primarily driven by the sharp rise of the -component at .
A.3 Bias from correlating with large scale slope rather than density
The main text defines of Eq. 59 as the ratio of the Fourier transform of to , where is the unsmoothed dark matter fluctuation at distance from the protohalo. If denotes smoothed with a filter , then the Fourier transform of equals , and the Fourier transform of divided by also equals of Eq. 59.
With this in mind, consider Eq. 55 for . Writing as the Fourier transform of , and rearranging the order of the integrals over and yields
[TABLE]
Therefore, the slope variable
[TABLE]
Hence, if we define and as the values of the expressions above when , then we expect the ratio
[TABLE]
to give on large scales. This is a special case of a more general point made by Castorina et al. (2016a) that, for any which is linearly proportional to , the Fourier space ratio .
Fig. 16 compares with using the same measurements as in Fig. 14. We find that indeed approaches on large scales, so that on large scales, in agreement with the assertion of Castorina et al. (2016a).
A.4 Dependence on smoothing window: Practice
We noted in the main text that, although the correlations between variables which our methodology exploits may depend on a smoothing window , it does not assume any particular functional form for . Our use of in the main text was motivated by the fact that, with it, one obtains a good description of the cross-correlations between protohalos and the matter. To illustrate that our methodology is not tied to this functional form, we now show the result of using instead. Provided that using in Eq. 59 does provide a good description of the curves shown in Fig. 8, our methodology asserts that , , and may all change, but the consistency relation between them, Eq. 30, will still apply.
In practice, because only at or so, we already know that a model based on will not be optimal, especially for determining . However, if we determine and from fitting to only, then it is possible that everything will work as well as it did for . In particular, since both and on large scales, we expect . If we set both and equal to , we expect at low . We can do it more accurately by matching obtained using with that from up to second order. These changes, along with the fact that , obviously impact one side of the consistency relation. On the other hand, either. In fact from Fig. 6, the threshold measured using the effective window is systematically lower than that from the top-hat. So it is possible that this change to the rhs of the consistency relation compensates to a large extent.
Fig. 17 shows that this is indeed what happens. Blue circles and red triangles compare the two sides of the first equation in Eq. 30 for and . Note that this way of presenting the consistency relation test is different from that in Fig. 10, since there we also wanted to show the mass dependence of the enclosed overdensity, whereas here the object of interest is the consistency relation itself. Both sets of symbols in Fig. 17 lie close to the one-to-one line, indicating that the consistency relation applies both for and for .
The agreement is slightly worse for , presumably because using in Eq. 59 is simply a better model for the cross bias (Chan et al., 2015). (For , we have actually refit , and , rather than rescaling as described above. The resulting fractional deviation in Fig. 17 is of the same order as from .) Fig. 18 shows a similar analysis of the consistency relation for the slope. In this case, fares substantially better; presumably this is because, as Fig. 13 shows, results in a cuspy signal around , so it changes significantly if the appropriate scale differs even slightly from , which can happen if is not a perfect fit to .
Recently, Modi et al. (2016) reported the results of a similar test of the consistency relation for based on . However, they presented their results in a format which makes it difficult to assess ten percent discrepancies. Our Fig. 19 shows our results in the format they used. The agreement between the direct and large-scale structure measurements is impressive; however, the larger dynamic range here does not do justice to the level of agreement which our preferred formats, Figs. 10 and 17, show. Unfortunately, they do not show results for the slope. Nevertheless, the agreement between our analyses for is reassuring, as it indicates that detailed a priori knowledge of the shape of the smoothing window is not crucial for reconstructing from large scale bias – in agreement with the arguments given in the main text.
References
- Appel L., Jones B. J. T., 1990, MNRAS, 245, 522
- Bardeen J. M., Bond J. R., Kaiser N., Szalay A. S., 1986, ApJ, 304, 15
- Biagetti M., Chan K. C., Desjacques V., Paranjape A., 2014, MNRAS, 441, 1457
- Bond J. R., Cole S., Efstathiou G., Kaiser N., 1991, ApJ, 379, 440
- Castorina E., Sheth R. K., 2017, Unequal correlations in the multi-dimensional excursion set approach, in preparation
- Castorina E., Paranjape A., Sheth R. K., 2016a, Constraints on halo formation from cross-correlations with correlated variables (arXiv:1611.03613)
- Castorina E., Paranjape A., Hahn O., Sheth R. K., 2016b, Excursion set peaks: the role of shear (arXiv:1611.03619)
- Chan K. C., Scoccimarro R., Sheth R. K., 2012, Phys. Rev. D, 85, 083509
