The impact of non-Gaussianity on the error covariance for observations of the Epoch of Reionization 21-cm power spectrum
Abinash Kumar Shaw, Somnath Bharadwaj, Rajesh Mondal

TL;DR
This paper demonstrates that non-Gaussianity significantly increases error estimates and introduces correlations in the 21-cm power spectrum measurements of the Epoch of Reionization, impacting future observational strategies.
Contribution
It develops a methodology to incorporate non-Gaussian effects into error predictions for 21-cm observations with radio interferometers like SKA-Low.
Findings
Non-Gaussianity causes a 40-200% increase in error estimates at certain scales.
Errors become correlated and anticorrelated across different k-bins due to non-Gaussianity.
Non-Gaussian effects remain significant even when considering foreground removal.
Abinash Kumar Shaw¹,², Somnath Bharadwaj¹,² and Rajesh Mondal³
¹ Department of Physics, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
² Centre for Theoretical Studies, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
³ Astronomy Centre, Department of Physics and Astronomy, University of Sussex, Brighton BN1 9QH, UK
(Accepted: 2019 May 31; Revised: 2019 May 27; Received: 2019 February 22)
Abstract
Recent simulations show the Epoch of Reionization (EoR) 21-cm signal to be inherently non-Gaussian whereby the error covariance matrix of the 21-cm power spectrum (PS) contains a trispectrum contribution that would be absent if the signal were Gaussian. Using the binned power spectrum and trispectrum from simulations, here we present a methodology for incorporating these with the baseline distribution and system noise to make error predictions for observations with any radio-interferometric array. Here we consider the upcoming SKA-Low. Non-Gaussianity enhances the errors introducing a positive deviation relative to the Gaussian predictions. This deviation increases with observation time and saturates as the errors approach the cosmic variance. Considering hours where a detection is possible at all redshifts , in the absence of foregrounds we find that the deviations are important at small where we have at for some of the redshifts and also at intermediate where we have at . Non-Gaussianity also introduces correlations between the errors in different bins, and we find both correlations and anticorrelations with the correlation coefficient value spanning . Incorporating the foreground wedge, the deviation continues to be important () at . We conclude that non-Gaussianity makes a significant contribution to the errors and this is important in the context of the future instruments that aim to achieve high-sensitivity measurements of the EoR 21-cm PS.
keywords:
cosmology: reionization, first stars, large-scale structure of universe, diffuse radiation, methods: statistical, technique: interferometric
1 Introduction
The Epoch of Reionization (EoR) is an important but poorly understood milestone in the cosmic history when the hydrogen in the universe underwent a transition from the neutral (H i) to the ionized (H ii) phase. Our current knowledge of the EoR comes from several indirect observations. The measurements of the Thomson scattering optical depth (Planck Collaboration et al., 2016a, b) of the cosmic microwave background radiation (CMBR) with the free electrons in the intergalactic medium (IGM) suggest that the universe was ionized at less than level at redshifts above . Measurements of the high-redshift quasar spectra (Becker et al., 2001; Fan et al., 2002, 2006; Gallerani et al., 2006; Becker et al., 2015) show a complete Gunn–Peterson trough and also measurements of the Gunn–Peterson optical depth suggest that the reionization was over by . Recent studies of the Ly-α emitters (LAE) show a rapid decline in the luminosity function at (Ouchi et al., 2010; Jensen et al., 2014; Konno et al., 2014; Faisst et al., 2014; Santos et al., 2016; Ota et al., 2017; Zheng et al., 2017) which suggests a rapid increase in the H i density in the IGM and a patchy H i distribution at those redshifts. These indirect observations together suggest that reionization occurred within a redshift range (Mitra et al., 2013; Robertson et al., 2013; Mitra et al., 2015; Robertson et al., 2015; Mondal et al., 2016; Dai et al., 2019). However, such indirect observations are not adequate to address many fundamental issues related to the EoR such as the exact duration and timing, the properties of the ionizing sources and the topology of the H i distribution.
Observations of the redshifted 21-cm radiation due to the hyperfine transition of H i are a promising probe to study the high-redshift universe (Sunyaev & Zeldovich, 1972; Scott & Rees, 1990). The low-frequency radio interferometers will measure brightness temperature fluctuations of the EoR 21-cm radiation (Bharadwaj & Sethi, 2001; Bharadwaj & Ali, 2005). A substantial effort is currently underway to measure the EoR 21-cm signal using the first-generation radio interferometers e.g. GMRT (http://www.gmrt.ncra.tifr.res.in; Paciga et al., 2013), MWA (http://www.haystack.mit.edu/ast/arrays/mwa; Jacobs et al., 2016), LOFAR (http://www.lofar.org; Yatawatta, S. et al., 2013), PAPER (http://eor.berkeley.edu; Ali et al., 2015) and the second-generation interferometers such as HERA (http://reionization.org; Pober et al., 2014; Ewall-Wice et al., 2016) and the upcoming gigantic SKA (http://www.skatelescope.org; Koopmans et al., 2015). These experiments aim to measure the EoR 21-cm power spectrum (PS) (Bharadwaj & Ali, 2004). The expected EoR 21-cm signal is about orders of magnitude weaker compared to the galactic and extragalactic foregrounds (Ali et al., 2008; Bernardi, G. et al., 2009, 2010; Ghosh et al., 2012; Paciga et al., 2013; Beardsley et al., 2016). The foregrounds, together with the system noise and other calibration errors, pose a huge challenge for the measurement of the EoR 21-cm PS. Only weak upper limits on the EoR 21-cm PS have been estimated to date (McGreer et al., 2011; Parsons et al., 2014; Pober et al., 2016a). In addition to the PS, various other statistics such as the variance (Patil et al., 2014), bispectrum (Yoshiura et al., 2015; Shimabukuro et al., 2017; Majumdar et al., 2018) and the Minkowski Functional (Kapahtia et al., 2018; Bag et al., 2018) have been proposed to quantify the EoR 21-cm signal.
In the recent past, several works have made quantitative predictions of the sensitivity for measuring the EoR 21-cm PS (Morales & Hewitt, 2004). McQuinn et al. (2006) have made predictions for hours of observations with the MWA, LOFAR and the upcoming SKA-Low. Beardsley et al. (2013) have estimated that MWA is capable of detecting the EoR 21-cm signal at level with hours of observations. Zaroubi et al. (2012) have made quantitative predictions for sensitivity of LOFAR considering hours of observations, and Jensen et al. (2013) have predicted that LOFAR will be able to detect the EoR 21-cm PS at with hours of observations. Parsons et al. (2012) have predicted that the EoR 21-cm signal can be detected at with PAPER in months of observations. The results of Pober et al. (2014) suggest that the upcoming HERA will be able to detect the EoR 21-cm PS at a level within the range assuming a moderate foreground model. Ewall-Wice et al. (2016) have studied the prospects of detecting the EoR 21-cm PS with HERA incorporating X-ray heating of the IGM.
The upcoming SKA-Low, to be located in Australia, will be the most sensitive radio telescope to be built. It will have stations, each of which combines the signal from several constituent log periodic dipole antennas. Each of these stations is planned to be in diameter. The telescope will operate within a frequency band of and it will have field of view. The interferometer will have a compact core and spiral arms which will extend up to a large distance such that the maximum antenna separation is . A recent study by Mellema et al. (2013) has quantified the prospects of detecting the EoR 21-cm PS with SKA-Low. The authors have predicted the errors in the measured EoR 21-cm PS at three different redshifts . In this analysis they have varied the number of core antennas and also the core radius. The analysis incorporates the system noise assuming hours of observation with a bandwidth of . They find that it will be possible to achieve a maximum SNR of at for all the three redshifts. They also find that the predictions for SKA-Low show a significant improvement in comparison with other precursor telescopes such as MWA, LOFAR and PAPER (figs 21 and 22 of Mellema et al., 2013).
All the existing predictions for detecting the EoR 21-cm PS have assumed the signal to be a Gaussian random field. This assumption plays a crucial role in making the predictions. The PS completely specifies the statistical properties of the signal for a Gaussian random field, and this assumption allows the signal in each Fourier mode to be treated as being independent. Gaussianity is possibly a good assumption during the early stages of EoR, and also when one observes very large length-scales. However, the growth and subsequent overlapping of the H ii regions make the signal highly non-Gaussian as reionization progresses (Bharadwaj & Pandey, 2005). The PS no longer quantifies the entire statistical properties of the signal as the signals in different Fourier modes are correlated. Higher order statistics like the bispectrum (Majumdar et al., 2018) and trispectrum are needed to quantify these correlations. This also affects the error predictions for the PS. Considering only cosmic variance (CV) that is inherent to the signal, Mondal et al. (2015) have studied the effects of non-Gaussianity on the error predictions for the EoR 21-cm PS. For a Gaussian random field, the SNR for the 21-cm PS is expected to increase as the square root of the number of independent Fourier modes. However, Mondal et al. (2015) find that as a consequence of the non-Gaussianity the SNR saturates at a limiting value [SNR]_l beyond which it does not increase any further. The value of [SNR]_l was also found to decrease with the progress of reionization, which corresponds to an increase in the non-Gaussianity. Two subsequent papers (Mondal et al., 2016, 2017) have quantified the error covariance for the binned PS, which now has an extra contribution from the trispectrum as compared to the Gaussian situation where the error covariance can be expressed entirely in terms of the PS.
In these papers they have developed a unique statistical technique for estimating the bin-averaged trispectrum from the PS error covariance. They have used an ensemble of seminumerical EoR simulations to estimate the error covariance and the trispectrum at several redshifts in the range . The trispectrum contribution is found to increase significantly as reionization progresses. The non-Gaussianity is found to result in larger error estimates compared to the Gaussian predictions. Non-Gaussianity also introduces correlations between the PS error estimates at different bins.
In this paper, we predict the prospects of measuring the EoR 21-cm PS using observations with the upcoming SKA-Low. To this end we study the error covariance of the EoR 21-cm PS that will be measured by SKA-Low. Unlike the previous works (e.g. Mellema et al. 2013), our analysis incorporates the inherent non-Gaussian nature of the signal. We have used the EoR 21-cm PS and trispectrum from the simulations of Mondal et al. (2017). We include the system noise contribution to calculate the full PS error covariance for the current proposed configuration of SKA-Low (see SKA1_LowConfigurationCoordinates-1.pdf). The analysis in this paper also incorporates the impact of foregrounds considering the EoR 21-cm signal to be free of other possible calibration errors.
The structure of this paper is as follows. Section 2 briefly describes the simulations and the techniques used in Mondal et al. (2017) to obtain the EoR 21-cm PS and trispectrum. Section 3 briefly presents the SKA-Low configuration and discusses how to combine the observed visibility data for an optimal estimate of the EoR 21-cm PS. We also present a framework to compute the EoR 21-cm PS error covariance. Section 4 presents the results considering no foregrounds. In Section 5 we study the effects of foregrounds and finally summarize and discuss our findings in Section 6. In keeping with the simulations of Mondal et al. (2017), we have used the Planck+WP (Planck Collaboration et al., 2014) best-fitting cosmological parameters throughout this paper.
2 Simulating The EoR 21-cm Signal
We have simulated the EoR 21-cm signal at six different redshifts using a seminumerical technique (Majumdar et al. 2013; Mondal et al. 2015) that comprises three major steps. First, we generate the dark matter distributions at the aforementioned redshifts using a publicly available particle mesh N-body code (https://github.com/rajeshmondal18/N-body; Bharadwaj & Srikant, 2004). We have simulated the dark matter distributions within a cube of comoving volume with a grid size of and a mass resolution of . Next, we identify the dark matter haloes within the matter distribution using a publicly available halo finder (https://github.com/rajeshmondal18/FoF-Halo-finder) based on the Friends-of-Friends (FoF) algorithm (Davis et al., 1985) with a linking length times the mean inter-particle spacing and a minimum halo mass of which corresponds to simulation particles. In the final step we generate the reionization map using a publicly available seminumerical code (https://github.com/rajeshmondal18/ReionYuga) following the formalism adopted by Choudhury et al. (2009). We assume that the hydrogen traces the dark matter, and the haloes with masses exceeding a minimum halo mass () host the ionizing sources, the number of ionizing photons emitted by a source being proportional to the host halo mass through a dimensionless constant of proportionality , which incorporates a large number of unknown parameters like the star formation efficiency and the UV photon escape fraction.
The hydrogen and photon densities are, respectively, smoothed over spheres of radius . Any grid point within the simulation is considered to be completely ionized if the smoothed photon density exceeds the smoothed hydrogen density; the smoothing radius is allowed to vary from one grid spacing to a maximum value of . The resulting H i distribution is mapped to redshift space using the prescription of Majumdar et al. (2013) to generate the final 21-cm brightness temperature distribution on a grid eight times coarser than the N-body simulation. The simulations used here are exactly the same as those that were used in Mondal et al. (2016, 2017) and the reader is referred to those papers for further details. These simulations have three free parameters, namely the minimum halo mass, the ionizing efficiency and the mean free path of the ionizing photons. We have used the values , and (Songaila & Cowie, 2010) to obtain a reionization history where the mean mass averaged neutral fraction has a value at and is over by . The integrated Thomson scattering optical depth obtained using these parameter values, , is also consistent with the observations (Planck Collaboration et al., 2016a) where .
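The ionization criterion described above can be illustrated with a short sketch. This is not the ReionYuga implementation; it is a minimal example that assumes a cubic periodic grid, a spherical top-hat filter applied through FFTs, and integer smoothing radii in grid units. The function names `tophat_smooth` and `ionization_map` are hypothetical.

```python
import numpy as np

def tophat_smooth(field, radius):
    """Smooth a cubic, periodic 3D field with a spherical top-hat (radius in grid units)."""
    n = field.shape[0]
    # Signed integer offsets from the origin on a periodic grid: 0, 1, ..., -2, -1.
    offsets = np.fft.fftfreq(n, d=1.0 / n)
    x, y, z = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    kernel = (x**2 + y**2 + z**2 <= radius**2).astype(float)
    kernel /= kernel.sum()  # normalise so that the mean of the field is preserved
    # Circular convolution via the convolution theorem.
    return np.fft.ifftn(np.fft.fftn(field) * np.fft.fftn(kernel)).real

def ionization_map(n_hydrogen, n_photon, r_max):
    """Flag a cell as ionized if, for any radius from 1 to r_max, the smoothed
    photon density exceeds the smoothed hydrogen density at that cell."""
    ionized = np.zeros(n_hydrogen.shape, dtype=bool)
    for radius in range(1, r_max + 1):
        ionized |= tophat_smooth(n_photon, radius) > tophat_smooth(n_hydrogen, radius)
    return ionized
```

In this toy version the densities are in arbitrary but matching units, and `r_max` plays the role of the maximum smoothing radius set by the mean free path of the ionizing photons.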
3 Power Spectrum Error Covariance
We quantify the statistics of the EoR 21-cm brightness temperature fluctuations using the power spectrum (PS) which is defined as . Here is the simulation (observational) volume, is the Fourier transform of the brightness temperature fluctuations and k is a wave vector. In the absence of foregrounds and calibration errors, the brightness temperature fluctuations recorded by a radio interferometer are which is a sum of the 21-cm signal and the system noise contribution . The PS corresponding to therefore is a sum of and which is the system noise PS i.e. . We have used the simulations described in Section 2 to predict the EoR 21-cm PS . In this work we make predictions for the upcoming SKA-Low, and we have used the specification described in the subsequent paragraph to compute the noise PS . We have considered the upcoming SKA-Low to be an array of stations, each of which is a station of diameter m. The instrument will operate within a frequency range of which will probe the H i 21-cm signal between and . The EoR 21-cm signal evolves significantly along the line of sight (LoS) and observations at different redshifts will probe the signal at different stages of reionization due to the light-cone effect (Datta et al., 2012, 2014). As a consequence, the signal no longer remains ergodic along the LoS and there is a significant loss of information if the entire frequency band is used to estimate the PS (Mondal et al., 2018; Mondal et al., 2019). In the present work we have avoided this by restricting the analysis to six different redshift slices each of width centred at redshifts . We have also assumed that the entire frequency bandwidth is divided into frequency channels of width . Note that the antenna layout, the number of antennas and the channel width assumed here are only representative values, and may change in the final implementation of the telescope.
The analysis in this paper considers an observation tracking a field at declination DEC using SKA-Low for hours with -second integration time. The -second integration time has been chosen here to keep the simulated baseline data volume small. However, the purpose of simulating the array baseline configuration here is to primarily estimate , and we find that the noise predictions do not show any noticeable change even when the integration time is reduced to seconds or to seconds. Considering d to be the projection of the antenna separation on the plane perpendicular to the LoS, we use with being the wavelength that corresponds to the central frequency of a slice. The subsequent analysis is restricted to the baselines U corresponding to the antenna separations as the baseline distribution falls off rapidly at larger values of d. The simulated observations provide us with the baselines and frequency channels at which the signal will be measured. We use and with where is the comoving distance to the centre of a redshift slice, $r_{c}^{\prime}=\partial r/\partial\nu\big|_{\nu=\nu_{c}}$, is the frequency bandwidth of the redshift slice and . Note that is the Fourier conjugate of . The simulations provide us with a set of comoving vectors at which we will obtain measurements of the brightness temperature fluctuations . Two different baselines having separation less than do not have independent information due to overlap of the antenna beam pattern (Bharadwaj & Ali, 2005). We grid the comoving wave vectors with a grid of size and . Considering a grid point , we define to be the number of measurements that lie within a voxel centred at . We use to estimate the noise PS at each grid point using the following expression (Chatterjee & Bharadwaj, 2018):
[Equation (1)]
Here is the system temperature, is the number of polarizations, is the number of observed nights with hours per night, is the integration time, is the geometric area of a single antenna. It is convenient to quantify the total duration of the observations using hours instead of , and we have used through the subsequent discussion of this paper. The system temperature is a sum of the sky temperature (Fixsen et al., 2011) and the receiver temperature . Here is defined using
[Equation (2)]
where is the telescope’s primary beam pattern (Sarkar & Bharadwaj, 2013; Parsons et al., 2014). We have approximated the beam pattern with a Gaussian (Choudhuri et al., 2014) and evaluated the solid angle integral in the flat sky approximation to obtain . Note that is infinitely large at the grid points where i.e. the grid points that are not sampled by the telescope baseline distribution.
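The gridding step described above can be sketched as follows. The example only illustrates the bookkeeping: it counts how many measured wave vectors fall in each voxel and assigns an infinite noise PS to unsampled voxels. The inverse scaling of the noise with the number of measurements and the overall `amplitude` are assumptions made for illustration, not the expression of equation (1), and the function names are hypothetical.

```python
import numpy as np

def grid_counts(k_vectors, cell_size):
    """Assign each measured wave vector to its nearest voxel and count the hits.

    k_vectors: array of shape (N, 3); cell_size: grid spacing along each axis.
    Returns the occupied voxel indices and the number of measurements in each.
    """
    cells = np.rint(np.asarray(k_vectors) / np.asarray(cell_size)).astype(int)
    voxels, counts = np.unique(cells, axis=0, return_counts=True)
    return voxels, counts

def noise_power(counts, amplitude):
    """Toy noise PS per voxel, assumed to scale as amplitude / counts.

    Voxels with zero measurements get an infinite noise PS.
    """
    counts = np.asarray(counts, dtype=float)
    with np.errstate(divide="ignore"):
        return np.where(counts > 0, amplitude / counts, np.inf)
```

For example, `grid_counts([[0.1, 0, 0], [0.12, 0, 0], [1.0, 0, 0]], [0.1, 0.1, 0.1])` places the first two vectors in the same voxel and the third in a separate one.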
Considering a typical SKA-Low observation spanning an angular extent of on the sky with an angular resolution and a frequency bandwidth of with frequency resolution , this corresponds to different grid points at which the EoR 21-cm PS will be measured. The dimension of the resulting PS error covariance matrix is which renders further computations prohibitively expensive if not impossible. In order to overcome the intractability of such a large covariance matrix, we bin the k space and use the binned PS estimator that, for the -th bin, is defined as
[Equation (3)]
where the sum is over the modes within the -th bin and is the normalized weight associated with each mode with . Here is the average value corresponding to the -th bin. The weights have been introduced to account for the fact that the ratio varies across the different grid points, and as discussed later, the weights have been chosen so as to maximize the SNR of the bin-averaged PS. For the present analysis we have divided the available k space into logarithmic spherical bins. The ensemble average of gives the bin-averaged PS . Note that the resulting estimate has a noise bias , this however can be eliminated by suitably modifying the estimator (Choudhuri et al., 2016b). In the subsequent analysis we assume that the noise bias has been eliminated and we have an unbiased estimate of the bin-averaged power spectrum . The noise contribution to the PS error covariance , however, cannot be eliminated and following the calculation presented in Mondal et al. (2016), we have
[Equation (4)]
where the sum is over the grid points and in the -th and the -th bins respectively. The trispectrum originates due to non-Gaussianity of the EoR 21-cm signal; the quantity that appears here is the weighted bin-averaged trispectrum. For the diagonal terms of the covariance matrix the trispectrum quantifies the excess with respect to the Gaussian predictions. The off-diagonal terms of are predicted to be zero if the EoR 21-cm signal were a Gaussian random field. The trispectrum arising due to the non-Gaussianity introduces non-zero off-diagonal terms corresponding to correlations (and anticorrelations) between the errors in the PS estimates in the different bins (Mondal et al., 2016, 2017). The system noise has been considered to be the outcome of a Gaussian random process and this does not contribute to the non-Gaussianity through the trispectrum.
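A weighted, bin-averaged estimate of the kind described above can be sketched in a few lines. The example assumes logarithmically spaced spherical bins between hypothetical limits `k_min` and `k_max`, and it takes the per-mode power values and weights as given inputs; it does not reproduce the noise-bias subtraction or the specific weights of the paper.

```python
import numpy as np

def log_bin_index(k_mag, k_min, k_max, n_bins):
    """Return the logarithmic spherical bin index of each |k| and the bin edges."""
    edges = np.logspace(np.log10(k_min), np.log10(k_max), n_bins + 1)
    return np.digitize(k_mag, edges) - 1, edges

def binned_power(power, weights, bin_index, n_bins):
    """Weighted average of per-mode power in each bin.

    Weights are normalised within each bin so that they sum to one;
    bins with zero total weight are returned as NaN.
    """
    power = np.asarray(power, dtype=float)
    weights = np.asarray(weights, dtype=float)
    bin_index = np.asarray(bin_index)
    p_bin = np.full(n_bins, np.nan)
    for i in range(n_bins):
        in_bin = bin_index == i
        w_sum = weights[in_bin].sum()
        if w_sum > 0:
            p_bin[i] = np.sum(weights[in_bin] * power[in_bin]) / w_sum
    return p_bin
```

Modes with zero weight, such as unsampled grid points, drop out of the average automatically.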
3.1 Computing the Error Covariance from the Simulations
The PS error covariance consists of two components: (1) the cosmic variance (CV), and (2) the system noise. According to equation (4), we need the EoR 21-cm PS , the EoR 21-cm trispectrum , the noise PS and appropriate weights to compute the . The reionization simulations of Mondal et al. (2017) provide us with the bin-averaged EoR 21-cm PS
[Equation (5)]
and the bin-averaged trispectrum
[Equation (6)]
where the sum in equation (6) is over the grid points ( modes) in the -th and -th bins, and the and are numbers of grid points in the respective bins. The bins that we have chosen to analyse the simulated SKA-Low observations have exactly the same boundaries as the bins used to analyse the EoR simulations in Mondal et al. (2017), however we cannot directly use the and from Mondal et al. (2017) in equations (3) and (4) to predict the PS error covariance for the SKA-Low observations. First, equations (5) and (6) assume uniform weights, whereas it is necessary to consider the variation of across the grid points to account for the non-uniform sampling when considering the simulated observations (equations 3 and 4). Further, the resolution of the simulations and the observations will, in general, be different and consequently the k grid spacing will also differ.
One can attempt to estimate the ensemble averages of at every individual grid point and at every pair of grid points, however these estimates will be extremely noisy due to the limited number of statistically independent realizations in the EoR 21-cm signal ensemble (e.g. in Mondal et al. 2017). Further, we have an enormous volume of the trispectrum data that renders this approach unfeasible. The issue now is to predict the bin-averaged PS (equation 3) and its error covariance (equation 4) for the SKA-Low observations using the results (equations 5 and 6) from the simulations of Mondal et al. (2017).
Here we have assumed that the EoR 21-cm PS does not vary much across the grid points within a bin (say the -th bin), and in equations (3) and (4) we have used the simulated from Mondal et al. (2017) to calculate for all the grid points in the -th bin. The value of in equation (4) depends on the magnitude and direction of the two vectors and , and both of these can vary widely even when the two vectors are in the same bin (). An even wider variation is possible when the two vectors are in two different bins and . Unfortunately this information is not available in (equation 6) evaluated from the simulation of Mondal et al. (2017). Here we have considered two different assumptions regarding the trispectrum at two different modes and . These two assumptions correspond to two extreme cases. Case–I: we assume that all the modes within a bin are equally correlated i.e. when both and are in the -th bin, and the correlation between modes in two different bins does not depend on the magnitude or orientation of the individual vectors i.e. when and are in the -th and -th bins, respectively. Case–II: we assume that the signal in two different Fourier modes is uncorrelated unless i.e. when the mode is in the -th bin. Case–I corresponds to the situation in which we have the maximum possible correlation between different modes whereas Case–II corresponds to the situation in which we have the minimum possible correlation between two different modes. In reality we expect the correlation between two modes to vary with the separation between the two modes, and the result is expected to lie within the two extreme cases considered here. Considering equation (6), we obtain for Case–I whereas it predicts for Case–II. Note that Case–II predicts the error covariance to be completely diagonal with all the off-diagonal terms being zero which is inconsistent with the findings of Mondal et al. (2016). 
While Case–II is unrealistic for the off-diagonal elements of the covariance matrix, we still consider its predictions for the diagonal elements in order to illustrate the effect of partial decorrelation in the value of the trispectrum across different modes.
We calculate the weights separately for both the cases by extremizing the SNR with respect to . Considering Case–I the unnormalized weights that extremize the SNR are
[Equation (7)]
which have in the denominator, i.e. the grid points with higher noise contribute less to the bin-averaged quantities. The grid points , which are unsampled during observations, i.e. , have an infinitely large noise PS (equation 1). The weight (equation 7) is therefore zero for the unsampled grid points and they do not contribute to the bin-averaged quantities. Using equation (7) in equation (4), we obtain the corresponding PS error covariance matrix
[Equation (8)]
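The behaviour of the Case–I weights can be illustrated with a small sketch. It assumes, as an illustration rather than a transcription of equation (7), that the weight-dependent part of the variance in a bin is the sum of squared weights times the squared total (signal plus noise) power at each grid point, so that the SNR-maximizing weights are inversely proportional to that squared total power. The function names are hypothetical.

```python
import numpy as np

def case1_weights(p_signal, p_noise):
    """Normalised inverse-variance weights, assumed proportional to 1 / (P + P_N)^2.

    Grid points with infinite noise (unsampled) automatically get zero weight.
    """
    p_noise = np.asarray(p_noise, dtype=float)
    unnormalised = 1.0 / (p_signal + p_noise) ** 2
    return unnormalised / unnormalised.sum()

def gaussian_variance(weights, p_signal, p_noise):
    """Weight-dependent (Gaussian) part of the bin variance, summed over sampled points."""
    weights = np.asarray(weights, dtype=float)
    p_noise = np.asarray(p_noise, dtype=float)
    sampled = np.isfinite(p_noise)
    return np.sum((weights[sampled] * (p_signal + p_noise[sampled])) ** 2)
```

With a signal power of 1 and noise powers of 0, 1 and infinity at three grid points, the normalised weights come out as 0.8, 0.2 and 0, and the resulting variance is lower than for uniform weights over the two sampled points.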
For comparison we consider the error covariance for a situation where the signal is a Gaussian random field for which the trispectrum is zero. The weights here are unchanged and these are given by equation (7), and we have the PS error covariance matrix
[Equation (9)]
The diagonal terms of the covariance matrices (equations 8 and 9) predict the error variance in the measured EoR 21-cm PS, i.e. . Equations (8) and (9) indicate that the Gaussian consideration underestimates the variance of the measured PS. The off-diagonal terms of the covariance matrix () predict the correlation between the errors at the -th and -th bins . The off-diagonal terms are zero for a Gaussian random field, and the errors in the different bins are uncorrelated. Non-Gaussianity however may introduce correlations between the different bins through the off-diagonal components of the trispectrum.
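The correlation between the errors in different bins is conveniently summarized by a dimensionless correlation coefficient. The sketch below uses the standard normalisation of a covariance matrix by the square roots of its diagonal elements; the function name is hypothetical and the input matrix is a made-up example, not a result from the paper.

```python
import numpy as np

def correlation_coefficients(cov):
    """Convert a covariance matrix C_ij into r_ij = C_ij / sqrt(C_ii * C_jj)."""
    cov = np.asarray(cov, dtype=float)
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)

# Example with an anticorrelated pair of bins (illustrative numbers only).
example = correlation_coefficients([[4.0, -1.0], [-1.0, 1.0]])
```

A diagonal covariance matrix, as expected for a Gaussian random field, maps to the identity matrix under this normalisation.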
We first discuss the diagonal terms , i.e. the variance. This has contributions from the CV as well as the system noise. The noise PS scales as (equation 1) and this has a large value for small observation times. Considering the behaviour of , for small observation times this is governed by the system noise contribution and we have
[Equation (10)]
Equation (10) shows that and consequently SNR for small observation times. The observations with very large elucidate another extreme of the error estimates (equation 8) where , and converges to the ‘CV’ that is given by
[Equation (11)]
where is the number of sampled grid points in the -th bin. The CV represents the lower limit for the PS error variance. This arises due to the inherent statistical uncertainty in the EoR 21-cm signal. The actual predicted error variance for a finite observing time will typically be larger than this due to the system noise contribution.
The corresponding cosmic variance for a Gaussian random field (equation 9) is given by
[Equation (12)]
A comparison of equations (11) and (12) illustrates an important difference between the Gaussian and non-Gaussian situations. We see that it is possible to reduce the CV with no lower bound by combining the signal from a larger number of k modes in the bin, i.e. increasing . In contrast, the presence of the trispectrum in equation (11) sets a lower limit to the value of , and it is not possible to lower the variance any further by increasing the number of k modes (Mondal et al., 2015).
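The saturation described above can be reproduced with a toy model. The sketch assumes a schematic form in which the Gaussian cosmic variance is the squared power divided by the number of modes, and the non-Gaussian case adds a term that does not depend on the number of modes (written here as a single parameter `trispectrum_over_volume`). This is an illustrative simplification with uniform weights, not the exact form of equations (11) and (12).

```python
import numpy as np

def snr_gaussian(power, n_modes):
    """SNR when the variance is power**2 / n_modes; grows as sqrt(n_modes)."""
    return power / np.sqrt(power**2 / n_modes)

def snr_non_gaussian(power, n_modes, trispectrum_over_volume):
    """SNR when a mode-count-independent trispectrum term is added to the variance."""
    variance = power**2 / n_modes + trispectrum_over_volume
    return power / np.sqrt(variance)

def snr_limit(power, trispectrum_over_volume):
    """Limiting SNR reached as n_modes grows without bound in the toy model."""
    return power / np.sqrt(trispectrum_over_volume)
```

In this model the Gaussian SNR keeps increasing with the number of modes, whereas the non-Gaussian SNR approaches `snr_limit` and cannot be improved by adding further modes to the bin.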
Next considering the off-diagonal terms (equation 8) which quantify the correlation between different bins, we see that this only depends on the trispectrum. This is intrinsic to the signal, and therefore is independent of the system noise and observation time.
Considering Case–II, the unnormalized weights are given by
[Equation (13)]
which differ from the weight in Case–I (equation 7). The weights now include a contribution from the trispectra for the non-Gaussian signal. Here also the weights are zero for the grid points that are not sampled by the baseline distribution. The weights for Case–II match those for Case–I (equation 7) if the signal were a Gaussian random field. The PS error covariance (using equations 4 and 13) in Case–II is given by
[Equation (14)]
Note that Case–II does not take into account the correlation between the different k grid points, which makes the off-diagonal terms of the covariance matrix zero. The error covariance for Cases I and II match for small observation times, and they have very similar forms for very long observation times (CV) where for Case–II we have
[Equation (15)]
This differs from the predictions for Case–I (equation 11) by the factor , which appears in equation (15). In our analysis we find that has values in the range for and over the rest of the range considered here. We see that the error predictions for Case–II are smaller than those for Case–I. The error predictions for Case–II are expected to lie somewhere in between the Gaussian predictions and Case–I which assumes that all the k modes in a bin are equally correlated.
We have used the resulting covariance matrices (equations 8, 9 and 14) to predict the errors for PS measurements in the different redshift slices introduced earlier in this section.
4 Results
Figure 1 shows the dimensionless EoR 21-cm PS (solid purple line) and the corresponding error estimates for Case–I. The solid lines represent the non-Gaussian error predictions (equation 8) and the dashed lines represent the corresponding Gaussian error predictions (equation 9); both of these have been multiplied with to make them dimensionless. The error estimates have contributions from both the cosmic variance (CV) and the system noise. There are broadly two main features visible in Figure 1. (1) We see that the system noise contribution dominates the errors at large . These errors come down as is increased. The errors also come down at lower where the system noise contribution is smaller ( increases with redshift). For each and we can identify a largest mode () below which a detection of the 21-cm power spectrum will be possible. A larger range becomes accessible for a detection ( increases) as is increased or we move to a lower . This is studied in more detail in Figure 2, which we discuss later. (2) We see noticeable differences between and . These differences are most prominent for the CV predictions that correspond to the limit , where the system noise becomes insignificant. The system noise contribution is inherently Gaussian, whereas the 21-cm signal is non-Gaussian. We find that the values of and match for small when the system noise dominates the errors. The differences between and become noticeable as is increased. The differences are primarily noticeable at small where there is a relatively smaller system noise contribution as compared to large . The differences also become more pronounced as we move to lower , where there is a smaller system noise contribution. The differences between and are studied in detail in Figure 3, which we discuss later.
Considering Figure 1, we see that the predicted error estimates all increase with mainly due to the system noise contribution in contrast to the expected signal , which is relatively flat across the relevant range. This implies that for any given a detection of the signal will only be possible at small whereas the errors in the power spectrum will dominate at large . Figure 2 shows the largest mode , below which SKA-Low will be able to measure the EoR 21-cm PS at confidence. We show this as a function of for the four representative values of indicated in the figure. We see that the value of increases as decreases, i.e. for a fixed observation time, we will progressively be able to probe a larger range of length-scales as reionization progresses. This is primarily a consequence of the fact that the system noise comes down at lower ; further, the amplitude of the 21-cm PS also increases as reionization progresses. However, the amplitude peaks at reionization and drops beyond this, causing to fall at . Considering hours we find that there is a limited range across which a detection of the 21-cm PS is possible. This is restricted to at high and increases somewhat to at and . There is a significant increase in the values of (by a factor of ) if is increased to hours. We see that with hours a detection will be possible in the range at . The value of increases gradually if is increased beyond hours. However, we see an exception at where there is a significant increase in if is increased beyond hours. The values of increase very slowly for hours and values are in the range for hours.
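The identification of the largest detectable mode can be sketched as follows. The helper `largest_detectable_k`, its arguments and the example numbers are hypothetical; the detection threshold is left as a free parameter, and the value used in the example is purely illustrative.

```python
import numpy as np

def largest_detectable_k(k_bins, power, sigma, threshold):
    """Return the largest k bin whose SNR = power / sigma reaches `threshold`.

    Returns None when no bin satisfies the detection criterion.
    """
    k_bins = np.asarray(k_bins, dtype=float)
    snr = np.asarray(power, dtype=float) / np.asarray(sigma, dtype=float)
    detected = snr >= threshold
    if not detected.any():
        return None
    return float(k_bins[detected].max())

# Illustrative numbers: a flat signal with errors that grow towards large k.
k_bins = [0.1, 0.3, 1.0, 3.0]
power = [10.0, 10.0, 10.0, 10.0]
sigma = [1.0, 1.5, 2.5, 20.0]
k_m = largest_detectable_k(k_bins, power, sigma, threshold=5.0)
```

Because the errors are dominated by system noise at large k, lowering the noise (a longer observation or a lower redshift) moves the returned value to larger k.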
Figure 3 shows the deviation of the non-Gaussian error estimates with respect to the corresponding Gaussian estimates. These deviations arise due to the contribution from the trispectrum (equation 8). Earlier studies (Mondal et al., 2016, 2017) show that the trispectrum increases at larger (smaller length-scales), and it also increases as reionization proceeds i.e. decreases. These effects are reflected in the behaviour of the CV, which ignores the system noise. Considering the CV, we see that the deviations are minimum at around , and the deviations increase monotonically at both smaller and larger values. At the smallest bin () we find at and , whereas to for the other redshifts. The values of increase significantly at with deviations of order or larger at for the entire range. Considering the redshift evolution of CV, we see that at large the deviations from the Gaussian predictions increase as reionization proceeds.
We see that for the values of approach the CV limit within hours for and within hours for lower redshifts. We find that the bins at are largely system noise dominated, and the deviations at these bins are small for even for an observing time of hours. However, at we find that also increases at large () for hours and we have at for hours. These deviations increase significantly at , where at for hours. The range where increases further to if is increased further to hours.
We next consider how the SNR for the 21-cm PS grows with increasing observation time . Figures 4–6 show the results for three representative bins located at (large scales), (intermediate scales) and (small scales), respectively. The SNR values are shown for both Case–I (purple solid line) and Case–II (blue solid line), as well as the Gaussian predictions (dotted black line). The CV limits () are shown as shaded regions for both the non-Gaussian (Case–I) and Gaussian predictions. We find that the differences between Case–I, II and the Gaussian predictions are noticeable only when the SNR approaches the CV limit. The Gaussian predictions are the most optimistic of the three, and the SNR values for Case–II are typically between those for Case–I and the Gaussian predictions. The figure also shows how increases with at the specified values of .
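The growth and saturation of the SNR with observation time can be illustrated with a simplified model. The sketch assumes uniform weights over the modes in a bin, a system noise power spectrum that falls inversely with observation time, and a trispectrum term that is independent of it. The function `snr_vs_time`, its parameters and the numbers are assumptions made for illustration; this is not the weighted estimator used in the paper.

```python
import numpy as np

def snr_vs_time(t_obs, p_bin, n_modes, noise_at_1h, t_bin=0.0, volume=1.0):
    """Schematic SNR of a binned PS measurement as a function of observation time.

    noise_at_1h : noise power spectrum for one hour of observation (assumed to
                  scale as 1 / t_obs)
    t_bin       : bin-averaged trispectrum (zero for a Gaussian signal)
    """
    t_obs = np.asarray(t_obs, dtype=float)
    p_noise = noise_at_1h / t_obs
    variance = (p_bin + p_noise)**2 / n_modes + t_bin / volume
    return p_bin / np.sqrt(variance)

# Illustrative numbers only.
t_obs = np.logspace(1, 6, 6)   # hours
snr_gauss = snr_vs_time(t_obs, p_bin=1.0, n_modes=100, noise_at_1h=100.0)
snr_ng = snr_vs_time(t_obs, p_bin=1.0, n_modes=100, noise_at_1h=100.0, t_bin=0.01)
```

In this toy model the two curves agree at short observation times, where the noise dominates, and separate as each approaches its own CV limit.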
Considering the lowest bin (; Figure 4), the SNR is largely constrained by the CV with a relatively small system noise contribution. The SNR saturates to the CV limit within a few hundred hours of observations at and within hours for . Considering Case–I, a measurement of the EoR 21-cm PS will be possible with hours at redshifts and with hours at , whereas a detection is limited by the CV at and . However, the Case–II predictions are more optimistic and they predict a detection to be possible. The deviations between the non-Gaussian and Gaussian predictions are found to become important within a few hundred hours of observations at redshifts .
Considering (Figure 5), the limiting SNR (CV) increases to values at and at , implying that a high-precision measurement of the EoR 21-cm PS is possible at these length-scales provided that is adequately large. The needed for a detection is hours at and it comes down at lower to hours at and . The SNR is highest at and we have SNR in hours of observations. The non-Gaussian effects make a relatively small contribution to the error predictions at this length-scale with in the range for hours. The non-Gaussian effects increase somewhat at , where we have for hours.
Considering the bin at (Figure 6) the SNR is largely system noise dominated. The SNR is well below the cosmic variance limit and increases with for the range shown in the figure except for the Case–I at . A detection will be possible with hours at , respectively. The value of the 21-cm PS falls at and the minimum observation time required for a detection increases to hours. The inherent non-Gaussianity of the 21-cm signal is important only at , where we have for hours.
We now discuss the off-diagonal elements of the covariance matrix , which are a measure of the correlation between error estimates at different bins. The off-diagonal terms of the covariance do not change with the observation time as we see in equation (8). It is convenient to consider the dimensionless correlation coefficients . The value indicates a perfect correlation between the errors at the two bins, whereas implies a complete anticorrelation. The errors in the two bins are completely uncorrelated if i.e. the two PS measurements are independent. Values and indicate partial correlation and anticorrelation, respectively. An earlier work (Mondal et al., 2017) presents a detailed analysis of the correlations evaluated from simulations. It was found that the non-Gaussianity inherent in the EoR 21-cm signal introduces a complex pattern of correlations and anticorrelations between the different bins. It was further found that these correlations (and anticorrelations) were statistically significant, i.e. they were in excess of the statistical fluctuations expected if the signal were purely a Gaussian random field. However, the earlier work did not include the effects of the baseline sampling and system noise corresponding to observations with a radio-interferometric array. For an array like SKA-Low, the correlation coefficient is dependent on the observation time through the diagonal elements , which appear in the denominator. As discussed earlier, the values of are typically large for small where they are system noise dominated. The relative significance of the correlations between the errors in different bins is small for small where has small values. The relative significance of these correlations increases as approaches the CV and we have considered hours for our analysis. The values of will increase if we consider a larger observation time.
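The dependence of the correlation coefficients on the diagonal elements can be sketched as below, assuming the standard normalization in which each off-diagonal covariance element is divided by the square root of the product of the two corresponding diagonal elements. The function name and the toy matrices are illustrative.

```python
import numpy as np

def correlation_coefficients(cov):
    """Normalize a covariance matrix: r_ij = C_ij / sqrt(C_ii * C_jj)."""
    cov = np.asarray(cov, dtype=float)
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)

# Toy 2-bin example: the off-diagonal term is fixed (signal only), while
# system noise adds to the diagonal terms only.
cov_cv = np.array([[4.0, 2.0],
                   [2.0, 4.0]])
cov_noisy = cov_cv + np.diag([4.0, 4.0])

r_cv = correlation_coefficients(cov_cv)
r_noisy = correlation_coefficients(cov_noisy)
```

With the off-diagonal term held fixed, the larger diagonal of `cov_noisy` yields a smaller coefficient than in the CV-limited case.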
Considering Figure 7, we see that in addition to (by definition) for all the diagonal elements, we have both positive and negative values of . The redshifts and show very similar features with a positive correlation () between the two smallest bins (), and the third bin () is anticorrelated ( to ) with the two smaller bins and one larger bin (). The nature of these correlations changes at , where the first five bins () are correlated. Of these, the four largest bins are strongly correlated () among themselves whereas the smallest bin is only mildly correlated () with the other bins. At , the first three bins are correlated () whereas the fifth bin shows anticorrelations () with the second and third bins. Considering , the first two bins are anticorrelated () with the other bins while the next five bins show strong correlations (). We thus see that there are noticeable correlations and anticorrelations between the errors in the estimated 21-cm PS in different bins at all stages of reionization. These correlations span a wide range of modes depending on the redshift.
5 Effects of Foregrounds
Foregrounds, which are orders of magnitude larger than the EoR 21-cm signal (e.g. Ghosh et al. 2012), are a major challenge for measuring the EoR 21-cm PS. Several approaches have been proposed to handle the foreground problem, one of these being foreground removal (e.g. Morales et al. 2006; Ali et al. 2008; Harker et al. 2009; Parsons et al. 2012; Bonaldi & Brown 2015; Chapman et al. 2015; Pober et al. 2016b). The entire analysis until now has assumed that the foregrounds have been perfectly modelled and removed; following Chatterjee & Bharadwaj (2018), we refer to this as the “Optimistic” scenario in the subsequent discussion.
The foreground contribution to the 21-cm PS is predicted to be localized within a wedge in the () plane (Datta et al., 2010), the boundary of this wedge being defined through (Morales et al., 2012)
[TABLE]
where is the maximum angular position in the sky (relative to the telescope pointing) from which foregrounds contaminate the signal. The modes outside this foreground wedge are expected to be free of foreground contamination, and the ‘foreground avoidance’ technique (e.g. Pober et al. 2013; Kerrigan et al. 2018) proposes to utilize only these modes to estimate the EoR 21-cm PS. Typically corresponding to the horizon, which is the maximum angle from which the foregrounds contaminate the signal. However, it is possible to taper the telescope’s field of view (Ghosh et al., 2011; Choudhuri et al., 2016a) and thereby restrict to an angle smaller than the horizon. Here, in addition to we also consider a situation in which we assume that tapering is used whereby where is the full width at half maximum of the SKA-Low primary beam. Note that changes with frequency and it is at . Following Chatterjee & Bharadwaj (2018), we refer to the two cases and as the ‘Moderate’ and ‘Pessimistic’ scenarios, respectively.
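Foreground avoidance amounts to masking the modes that fall inside the wedge. The sketch below assumes the wedge boundary is a straight line through the origin in the plane of perpendicular and parallel wavenumbers, with a slope fixed by the maximum angle and the cosmology. Here `slope` is passed in directly as a hypothetical parameter rather than evaluated from equation (16), and the mode coordinates are illustrative.

```python
import numpy as np

def outside_wedge(k_perp, k_par, slope):
    """Boolean mask of modes lying outside a wedge |k_par| <= slope * k_perp."""
    k_perp = np.asarray(k_perp, dtype=float)
    k_par = np.asarray(k_par, dtype=float)
    return np.abs(k_par) > slope * k_perp

# Illustrative modes (k_perp, k_par) in arbitrary units.
k_perp = np.array([0.1, 0.2, 0.4])
k_par = np.array([0.05, 0.5, 0.3])

usable_wide = outside_wedge(k_perp, k_par, slope=1.0)     # larger wedge
usable_narrow = outside_wedge(k_perp, k_par, slope=0.5)   # smaller wedge (tapered field of view)
n_wide, n_narrow = int(usable_wide.sum()), int(usable_narrow.sum())
```

A smaller slope, as obtained by restricting the maximum angle, leaves more modes available for the PS estimate.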
Figure 8 shows the SNR for detecting the EoR 21-cm PS at different bins for various values considering the non-Gaussian error covariance for Case–I. Starting from the left, the three columns show the predictions for the Optimistic, Moderate and Pessimistic scenarios, respectively, while the upper and lower rows correspond to and hours respectively. The first point to note is that a few bins for which all the modes are within the foreground wedge are excluded from the detection of the EoR 21-cm PS. These excluded bins occur at the two extremities (large and small ). Further in equation (16) the factor causes the extent of the foreground wedge to increase with ( also increases with in the Moderate scenario) and we see that the extent of the excluded bins increases at higher redshifts.
In each bin the number of modes that can be used for measuring the 21-cm PS decreases from the Optimistic to the Moderate and then the Pessimistic scenarios. This causes the SNR to decrease from the Optimistic to the Moderate scenario, and the SNR decreases even further for the Pessimistic scenario. The range where the SNR exceeds does not change very much from the Optimistic to Moderate scenario for both and hours, except for a small raising of the lower limit. The lower limit for a detection increases significantly for the Pessimistic scenario; however, the upper limit is not much affected outside the excluded bins. In all cases the SNR peaks at . Considering the region where the SNR exceeds , we see that for the Optimistic scenario with hours this spans from and to . The range shrinks to and for the Moderate scenario and shrinks even further to a very small region around and for the Pessimistic scenario. The range where the SNR exceeds increases significantly if the observing time is increased to hours; this is particularly prominent for the Pessimistic scenario where both the and ranges are considerably increased compared to hours.
Figure 9 shows the percentage deviation of the non-Gaussian error predictions (Case–I) with respect to the Gaussian predictions. Considering the Optimistic scenario discussed in the previous section (Figure 3), the deviations are prominent at the smallest bin for and and also in the range at . The number of modes in each bin gets reduced due to the foreground wedge, and consequently the relative contribution to the error covariance (equation 11) from the trispectrum is reduced. We therefore expect progressively smaller values of as we go from the Optimistic to the Moderate and the Pessimistic scenarios. Considering the Moderate scenario, the results are similar to the Optimistic ones, however the values of are somewhat smaller though they still exceed (and in some cases). For the Pessimistic scenario, however, the values of are considerably smaller and they do not exceed for hours whereas they exceed only in the range at for hours.
Figure 10 shows the correlations between the different bins induced by the non-Gaussianity considering hours. We have restricted the analysis to , where we have prominent deviations from the Gaussian predictions for all the three scenarios. Comparing the Optimistic scenario with the lower left panel of Figure 7, which shows the same for hours, we find that the extent of the positive correlation increases by one bin and the values of the correlation coefficients also increase. Comparing the left and centre panels of Figure 10, we see that the pattern of correlations and anticorrelations has the same extent for the Optimistic and Moderate scenarios; however, the magnitudes of decrease by . Considering the Pessimistic scenario, we find that the anticorrelation between the two smallest bins and the larger bins is not noticeable here. The extent of the bins with positive correlations is the same as the Optimistic scenario, but the values of are smaller. Considering other redshifts for which the results are not shown here, we find that there are some correlations between the different bins also at in the Moderate scenario; however, these are absent in the Pessimistic scenario. These correlations for the Moderate scenario are however considerably smaller and they are of the correlations seen in the bottom-right panel of Figure 7.
Summarizing this section, we find that foregrounds restrict the modes that can be used for detecting the EoR 21-cm PS. This results in reducing the SNR and also reducing the impact of non-Gaussianity on the error predictions. The deviations from the Gaussian predictions continue to be important () at even if the effect of Foreground Avoidance is included.
6 Summary and Conclusions
There are currently several radio-interferometric arrays such as LOFAR, MWA and PAPER which have been carrying out observations to detect the EoR 21-cm PS. Several other instruments like HERA and SKA, which are expected to have greater sensitivity, are under construction or planning. It is of considerable interest to have error predictions for the EoR 21-cm PS considering such observations, and there have been several works (e.g. Mellema et al. 2013; Pober et al. 2014; Greig & Mesinger 2015; Ewall-Wice et al. 2016) addressing this under the assumption that the EoR 21-cm signal is a Gaussian random field. However, there have been several studies (e.g. Bharadwaj & Pandey 2005; Mondal et al. 2015; Mondal et al. 2016, 2017; Majumdar et al. 2018) that show that the EoR 21-cm signal is non-Gaussian in nature. In this paper we study how these non-Gaussianities affect the error estimates for the EoR 21-cm PS considering observations with the upcoming SKA-Low.
The error predictions for any observation of the EoR 21-cm PS are quantified through the error covariance matrix , which depends on the PS and the trispectrum of the EoR 21-cm signal, and also observational effects like the array baseline distribution and the system noise. The EoR simulations generally provide predictions for the bin-averaged 21-cm PS and trispectrum without incorporating the observational effects. In this paper we first present a methodology for calculating by combining the simulated PS and trispectrum with these observational effects. The error covariance matrix for the binned 21-cm PS (equation 4) actually depends on the trispectrum evaluated at individual pairs of Fourier modes and . Unfortunately, this is not available from simulations, as the computations involved in a reliable estimate are extremely large and cumbersome. We have overcome this by considering two different cases where we approximate using the bin-averaged trispectrum, for which estimates are available from simulations. Results are mainly presented for Case–I, which assumes that the different k modes within the same bin are completely correlated. We also consider Case–II, which assumes the different k modes within the same bin to be totally uncorrelated. These represent two extreme cases, and the reality is expected to be somewhere in between. We find that the error predictions for Case–II are typically intermediate between the Gaussian predictions and Case–I. In most situations we may adopt a simple picture where the predictions for Case–I represent the upper limit for the error covariance matrix, and the actual errors may be expected to have values between these and the Gaussian predictions. It may however be noted that we do have a few situations where the predictions for Case–II exceed those for Case–I, as seen in the lower left-hand panel of Figure 6.
We find that the predicted errors typically increase at large (Figure 1) where they are system noise dominated. In this situation the r.m.s. error scales as , and the range below which a detection of the EoR 21-cm PS is possible () increases as is increased (Figure 2). The values of also increase as reionization proceeds as increases with redshift. At all a detection is possible for hours of observation. However, is largest at , and the accessible range is smaller at higher with at . The value of increases significantly for hours and we have for all . We have at all redshifts for hours. We note that at redshifts and a detection is not possible at the smallest bin , which is predicted to be cosmic variance limited (Figures 1 and 4).
The error predictions here are in excess of the Gaussian predictions that ignore the contribution from the trispectrum. At all the fractional deviation is found to exhibit a ‘U’ shaped dependence (Figure 3) in the CV limit where the system noise can be ignored. The deviations are minimum at where the ratio also is minimum, and rises steeply on both sides with particularly large values ) at . For finite observation times where the system noise is important, we have significant deviations () at for . However, for the errors are system noise dominated (except at ) and the deviations are small. At we have particularly large deviations ( and larger) at for hours.
The SNR (Figures 5 and 6) is expected to increase for small observation time where the system noise dominates the errors; we also expect the Gaussian predictions to match those for Case–I and Case–II in this regime. This is clearly seen for most redshifts at (Figure 5) and (Figure 6), which are, respectively, representative of intermediate and small length-scales. However, at we see that the SNR saturates at the CV limit beyond hours. At (Figure 4), which is representative of large length-scales, the SNR saturates within hours at all redshifts. The Gaussian predictions, Case–I and Case–II, also differ significantly, and the predictions for Case–II are typically between the Gaussian and Case–I predictions.
The inherent non-Gaussianity of the EoR 21-cm signal introduces correlations between the errors in different bins. Although () is independent of , the dimensionless correlation coefficients are dependent. We expect the correlations to become important for large , and we have presented results for hours (Figure 7). We find significant correlations and anticorrelations among the four smallest bins over the entire range. Further, we find strong correlations among some of the bins in the range at and .
The results summarized till now have not considered the foregrounds. The foreground contamination is expected to be restricted within a wedge, and only the modes outside this foreground wedge can be used for 21-cm PS detection. In addition to the Optimistic scenario where there are no foregrounds, we have also considered the Moderate and Pessimistic scenarios where the extent of the foreground wedge corresponds, respectively, to and in equation (16). We find that for both the foreground scenarios a few bins are excluded and the SNR is reduced compared to the Optimistic scenario (Figure 8). The impact of non-Gaussianity on the error predictions is also reduced (Figure 9). The results for the Moderate scenario are comparable to those for the Optimistic scenario, which has no foregrounds; however, the predictions are considerably degraded for the Pessimistic scenario. Finally we note that the deviations from the Gaussian predictions, including correlations between the different bins, continue to be important () for all the scenarios at .
In conclusion, we note that non-Gaussian effects make a significant contribution to the error predictions, particularly at low redshifts and large length-scales. In addition to increasing the error predictions with respect to the Gaussian predictions, they also introduce significant correlations and anticorrelations between different bins.
Acknowledgments
The authors would like to thank Dr. Raghunath Ghara and Srijita Pal for the help related to the specifications of SKA-Low and baseline distributions. AKS would like to thank Dr. Anjan K. Sarkar, Debanjan Sarkar and Suman Chatterjee for the fruitful discussions. RM would like to acknowledge funding from the Science and Technology Facilities Council (grant numbers ST/F002858/1 and ST/I000976/1) and the Southeast Physics Network (SEPNet).
References
- Ali et al. (2008) Ali S. S., Bharadwaj S., Chengalur J. N., 2008, Monthly Notices of the Royal Astronomical Society, 385, 2166
- Ali et al. (2015) Ali Z. S., et al., 2015, The Astrophysical Journal, 809, 61
- Bag et al. (2018) Bag S., Mondal R., Sarkar P., Bharadwaj S., Sahni V., 2018, MNRAS, p. sty714
- Beardsley et al. (2013) Beardsley A. P., et al., 2013, MNRAS: Letters, 429, L5
- Beardsley et al. (2016) Beardsley A. P., et al., 2016, The Astrophysical Journal, 833, 102
- Becker et al. (2001) Becker R. H., et al., 2001, The Astronomical Journal, 122, 2850
- Becker et al. (2015) Becker G. D., Bolton J. S., Madau P., Pettini M., Ryan-Weber E. V., Venemans B. P., 2015, MNRAS, 447, 3402
- Bernardi et al. (2009) Bernardi G., et al., 2009, A&A, 500, 965
