
TL;DR
This paper introduces a spherical ansatz for parameter-space metrics in signal analysis, improving the approximation of waveform mismatch at large separations, which enhances template bank construction for gravitational-wave searches.
Contribution
The paper proposes a new spherical ansatz for the mismatch metric that remains bounded and accurate at large parameter separations, unlike the traditional quadratic form.
Findings
The spherical ansatz matches the metric for small separations.
It provides a better approximation for large separations.
It may improve template-bank accuracy in semi-coherent searches.
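Table 1. Grid spacing $\Delta\omega$ in the example of Sec. III for a desired mismatch $m$, in the metric and spherical approximations.
| Mismatch $m$ | Metric approximation: grid spacing $\Delta\omega$ | Spherical approximation: grid spacing $\Delta\omega$ |
|---|---|---|
| 0.01 | 0.173/T | 0.173/T |
| 0.02 | 0.245/T | 0.246/T |
| 0.05 | 0.387/T | 0.391/T |
| 0.1 | 0.548/T | 0.557/T |
| 0.2 | 0.775/T | 0.803/T |
| 0.5 | 1.225/T | 1.360/T |
| 0.7 | 1.449/T | 1.717/T |
| 0.9 | 1.643/T | 2.163/T |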
Spherical ansatz for parameter-space metrics
Bruce Allen
MPI for Gravitational Physics, Callinstrasse 38, Hannover, Germany
Abstract
A fundamental quantity in signal analysis is the metric $g_{ab}$ on parameter space, which quantifies the fractional “mismatch” $m$ between two (time- or frequency-domain) waveforms. When searching for weak gravitational-wave or electromagnetic signals from sources with unknown parameters (masses, sky locations, frequencies, etc.) the metric can be used to create and/or characterize “template banks”. These are grids of points in parameter space; the metric is used to ensure that the points are correctly separated from one another. For small coordinate separations $d\lambda^a$ between two points in parameter space, the traditional ansatz for the mismatch is a quadratic form $m = g_{ab}\, d\lambda^a d\lambda^b$. This is a good approximation for small separations, but at large separations it diverges, whereas the actual mismatch is bounded. Here we introduce and discuss a simple “spherical” ansatz for the mismatch, $m = \sin^2\!\left(\sqrt{g_{ab}\, d\lambda^a d\lambda^b}\right)$. This agrees with the metric ansatz for small separations, but we show that in simple cases it provides a better (and bounded) approximation for large separations, and argue that this is also true in the generic case. This ansatz should provide a more accurate approximation of the mismatch for semi-coherent searches, and may also be of use when creating grids for hierarchical searches that (in some stages) operate at relatively large mismatch.
I Matched filtering and the overlap between templates
More than two decades ago, when the first generation of interferometric gravitational wave (GW) detectors was still in the planning stages, a handful of pioneers investigated the techniques that would be needed to detect GW signals Schutz (1989); 198 (1989); Sathyaprakash and Dhurandhar (1991); Schutz (1991); Finn and Chernoff (1993); Cutler et al. (1993); Sathyaprakash (1994); Cutler and Flanagan (1994); Apostolatos et al. (1994); Apostolatos (1995). At that time there were three main challenges. First, the signals were weak in comparison with the noise from the detectors, so they needed to be “teased out” of the data stream with optimal or near-optimal methods. Second, the parameters describing the signals (such as the object masses in a binary system, or the rotation frequency and spindown rate of a neutron star) were not known. This required repeated searches for signals with many different parameter combinations, creating a significant computational challenge. Lastly, even if the parameters were known precisely, for some sources the waveforms could only be calculated approximately. The errors could be estimated but not sharply quantified.
The solution to the first problem is to use “matched filtering” Sathyaprakash and Dhurandhar (1991); Cutler et al. (1993); Dhurandhar and Sathyaprakash (1994); Dhurandhar and Schutz (1994); Cutler and Flanagan (1994); Balasubramanian and Dhurandhar (1994); Sathyaprakash (1994); Balasubramanian et al. (1996a, b); Owen and Sathyaprakash (1999). In the simplest case, the time-dependent output $s(t)$ of the detector is correlated with a template $T(t)$ to produce a statistic
$$\rho = (s, T). \tag{1}$$
[Footnote 1: In many cases of interest a quantity corresponding to SNR is analytically maximized over some intrinsic phase parameters. This results in a “squared-SNR” detection statistic which is quadratic in the data. The mismatch we define then corresponds to the fractional loss of expected squared SNR in the strong signal limit.]
If the template is normalized, so that $(T, T) = 1$, then the statistic $\rho$ is called the signal-to-noise ratio (SNR). This is reviewed in a signal-processing context in Wainstein and Zubakov (1970); Helstrom and Wilson (1970) and in the GW context in Jaranowski and Krolak (2009) and Creighton and Anderson (2011).
The positive-definite inner product in Eq. (1) can be expressed in different ways. For example if the instrument noise is white (or the signal is confined to a narrow enough range of frequency that the noise is white in that band) then the inner product is
$$(A, B) = \mathcal{N} \int A(t)\, B(t)\, dt \tag{2}$$
where the integral extends over the support of the waveform or the duration of the data (whichever is shorter). The normalization constant $\mathcal{N}$ is set by requiring that the expected value of $(s, T)^2$ is unity, where $s$ is the detector output in the absence of any signals. [Footnote 2: If the signal is narrow-band, then to get the most SNR the instrument output should be filtered so that $s$ only contains the band of interest.]
If the detector noise is colored Dhurandhar and Sathyaprakash (1994) then the inner product is most simply expressed in the frequency domain as
[TABLE]
Here, the Fourier transform of a function of time $A(t)$ is denoted by $\tilde{A}(f)$, where $f$ is frequency, and $S(f)$ is the (single-sided) noise power spectrum of the instrument.
For real instrument data sampled at a finite rate the integral in Eq. (2) may be replaced with a sum over samples and the integral in Eq. (3) may be replaced with a sum over Nyquist-sampled frequency bins Allen et al. (2012).
The solution to the second problem is to construct the SNR in Eq. (1) for many different templates $T_{\lambda_i}$, where $\lambda_i$ are the parameters that describe the waveform and the integer $i$ labels a finite set of distinct points sampled from parameter space Sathyaprakash and Dhurandhar (1991); Dhurandhar and Sathyaprakash (1994); Apostolatos (1995, 1996); Owen and Sathyaprakash (1999); Cokelaer et al. (2004); Babak et al. (2006); Harry et al. (2009); Messenger et al. (2009); Manca and Vallisneri (2010); Brown et al. (2013); Fehrmann and Pletsch (2014); Kumar et al. (2014); Roy et al. (2017a). Here $\lambda$ denotes the collection of coordinates in parameter space; the individual coordinates are denoted by $\lambda^a$, where the index $a$ runs over the parameter-space coordinates.
For GWs from compact binary coalescence (CBC), $\lambda$ includes the masses of the objects, sky location, orbital inclination, time of the merger, spins (if relevant) and so on. For continuous gravitational waves (CW) from a spinning neutron star, $\lambda$ includes the sky location, frequency and frequency derivative, and so on Brady et al. (1998); Brady and Creighton (2000); Jaranowski et al. (1998); Pisarski et al. (2011); Wette et al. (2018). [Footnote 3: Some parameters are not included in $\lambda$. For example, the (normalized) templates are independent of the distance to the source. For search techniques based on fast Fourier transforms (FFTs), the coalescence time or signal phase might be left out because they are automatically searched over.]
The set of templates is called a template bank, and the art is in selecting their locations Owen (1996); Owen and Sathyaprakash (1999); Chen (2003); Cokelaer et al. (2004); Croce et al. (2004); Babak et al. (2006); Sahay (2006); Pan (2006); Cokelaer (2007a); Prix (2007a); Cokelaer (2007b); Whelan et al. (2008); Babak (2008, 2008); Ajith et al. (2008); Cokelaer and Pathak (2009); Messenger et al. (2009); Ajith et al. (2009); van den Broeck et al. (2009); Harry et al. (2009); Whelan et al. (2010); Manca and Vallisneri (2010); Pisarski et al. (2011); Brown et al. (2013); Keppel (2013); Kumar et al. (2014); Ajith et al. (2014); Fehrmann and Pletsch (2014); Pisarski and Jaranowski (2015); Chua and Gair (2016); Indik et al. (2017); Roy et al. (2017a); Dal Canton and Harry (2017); Roy et al. (2017b); Wette et al. (2018); Mukherjee et al. (2018); Roy et al. (2018, 2019); Roulet et al. (2019). Since the signals themselves come from a continuous family and the template bank is discrete, real signals will not have parameters that exactly match any template in the bank. So one must ensure that there is at least one template “close enough” to the signal that it is not missed. At the same time, since the statistic $\rho$ must be computed for each template, the number of templates should be no larger than needed. For the advanced LIGO and Virgo instruments, the CBC searches employ templates; the CW searches employ orders of magnitude more.
To place the templates in parameter space, an important quantity is the overlap (also called a fitting function or match) between two templates
$$\mathcal{O}(\lambda, \lambda') = (T_\lambda, T_{\lambda'}). \tag{4}$$
Because the templates are normalized, and the inner product is positive definite, the overlap lies in the closed interval $[-1, 1]$. [Footnote 4: Typically waveforms can have either sign, so one can constrain this overlap to the closed interval $[0, 1]$.]
The overlap is also relevant to the third of the challenges described above because it may be used to quantify the loss in SNR arising from inaccuracies in the waveform models. However in this paper we assume that the waveform models are exact, and concentrate on the previous issue.
II The mismatch and the metric approximation to the mismatch
Rather than using the overlap, it is more convenient to use a related quantity called the mismatch, but the literature contains several different definitions for this. Much of the work on CBC data analysis uses the mismatch $1 - \mathcal{O}$, where $\mathcal{O}$ is the overlap, and much of the literature on CW signals uses $1 - \mathcal{O}^2$. Here, we follow the latter convention, defining the mismatch as
$$m = 1 - (T_\lambda, T_{\lambda'})^2. \tag{5}$$
This mismatch lies in the interval $[0, 1]$ and is the fractional loss in the square of the expected SNR that arises when a signal with parameters $\lambda$ is detected using a template with parameters $\lambda'$. Sec. VIII gives results for another common definition, where the mismatch is the fractional loss of the expected SNR. In the Neyman–Pearson approach, $m$ is the fractional loss in the maximum of the log likelihood ratio in the strong signal limit.
We note that a signal search algorithm may (either analytically or explicitly) minimize the mismatch with respect to some of the intrinsic or extrinsic parameters. In this case we assume that these parameters are not included in the vector $\lambda$ and that the right-hand side (rhs) of the expression in Eq. (5) for $m$ is minimized over those missing parameters. [Footnote 5: Suppose that the mismatch is minimized with respect to some of the template parameters, and that the templates are a continuous function of these parameters. Then the results given here still hold, because at the extrema one may express each of these parameters as a function of the remaining free parameters.] In hierarchical searches, the mismatch may also be averaged over data segments; we return to this in Sec. VII.
It is helpful to think of the normalized templates $T_\lambda$ and $T_{\lambda'}$ as unit-length vectors which lie on the surface of the unit sphere, as illustrated in Fig. 1. In the case where the data and template are discretely sampled, the dimension of the embedding space is the number of samples in the template. In the continuous case the dimension is infinite and the sphere is embedded in a Hilbert space. [Footnote 6: This is very similar to the way that a normalized wave-function in quantum mechanics is expressed as an infinite sum of coefficients multiplying basis vectors.]
We define the angle $\theta$ between two normalized templates via
$$\cos\theta = (T_\lambda, T_{\lambda'}) \tag{6}$$
so that the mismatch may be expressed as
$$m = 1 - \cos^2\theta = \sin^2\theta. \tag{7}$$
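For discretely sampled data, the geometric picture above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the white-noise case with a plain dot product as the inner product, and the helper names (`normalize`, `overlap`, `angle_and_mismatch`) and the two sinusoids are hypothetical choices, not taken from the paper.

```python
import math

def normalize(samples):
    """Scale a sampled template to unit length under the plain dot product."""
    norm = math.sqrt(sum(x * x for x in samples))
    return [x / norm for x in samples]

def overlap(a, b):
    """Dot product of two unit-length sampled templates (white-noise assumption)."""
    return sum(x * y for x, y in zip(a, b))

def angle_and_mismatch(a, b):
    """Return (theta, m) with cos(theta) = overlap and m = 1 - overlap**2."""
    o = max(-1.0, min(1.0, overlap(a, b)))  # guard against rounding just outside [-1, 1]
    return math.acos(o), 1.0 - o * o

# Hypothetical example: two sinusoids sampled on the same time grid.
n = 4096
t = [k / n for k in range(n)]
a = normalize([math.cos(2.0 * math.pi * 50.0 * tk) for tk in t])
b = normalize([math.cos(2.0 * math.pi * 50.2 * tk) for tk in t])
theta, m = angle_and_mismatch(a, b)
print(f"theta = {theta:.4f} rad, mismatch = {m:.4f}, sin^2(theta) = {math.sin(theta) ** 2:.4f}")
```

The printed mismatch and $\sin^2\theta$ agree to rounding error, which is just the identity $1 - \cos^2\theta = \sin^2\theta$ applied to unit vectors.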
Since the mismatch is extremal and vanishes at $\lambda = \lambda'$, it can be expanded in a Taylor series in the parameter separation, which (generically) begins at quadratic order.
This “metric approximation” to the mismatch has a geometrical interpretation which was introduced in Owen (1996) and elaborated in Balasubramanian et al. (1996a, b); Owen and Sathyaprakash (1999). It is
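$$m \approx g_{ab}\, \Delta\lambda^a \Delta\lambda^b \tag{8}$$
where $\Delta\lambda^a = \lambda^a - \lambda'^a$, and we adopt the “Einstein summation convention” that repeated indices are summed over all parameter-space coordinates. The quantity $g_{ab}$ is called the parameter-space metric Owen (1996); for nearby templates, the quadratic form in Eq. (8) measures the squared fractional deviation or squared dimensionless “interval” between the templates.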
We note that there are other possible definitions of the metric, but this choice is normally adopted for the template placement problem, because templates must be placed “independently of the data” based on the expected properties of the signals and detector noise. A good discussion of this and of other possible definitions of parameter-space metrics may be found in the Introduction and in Appendix A of Prix (2007b, c).
III Simple illustrative example
To make this concrete, we consider a simple CW example. The waveforms are described by a single (angular) frequency parameter $\omega$. In the time domain the normalized templates are
[TABLE]
and vanish outside the observation interval. In the cases of interest $T$ would be days to years, and $\omega/2\pi$ would be tens to hundreds of cycles per second.
The overlap and mismatch between two templates may be easily computed, starting from Eq. (2) with :
[TABLE]
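For the cases of interest $T$ is large enough that there are many cycles in the observation interval, and the fractional difference between $\omega$ and $\omega'$ is small. This means that the second term on the rhs of Eq. (10) is negligible, so the mismatch is given in terms of the square of the sinc function
$$m = 1 - \mathrm{sinc}^2(T \Delta\omega) \tag{11}$$
where $\Delta\omega = \omega - \omega'$ and $\mathrm{sinc}(x) = \sin(x)/x$. This may be expanded as a Taylor series for small $\Delta\omega$, yielding $m = T^2 \Delta\omega^2/3 + O(\Delta\omega^4)$. Thus the metric is $g_{\omega\omega} = T^2/3$ and the metric approximation to the mismatch is
$$m \approx \frac{T^2 \Delta\omega^2}{3}. \tag{12}$$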
IV The metric approximation and the spherical approximation
Shown in Fig. 2 (blue) is the actual mismatch as a function of the frequency separation $\Delta\omega$, as given by Eq. (11). Also shown (orange) is the metric approximation from Eq. (12). One can see that these agree well for small values of $\Delta\omega$, but that the metric approximation breaks down at larger separations. One can also see that where they deviate, the quadratic approximation tends to overestimate the mismatch. This is well known to the experts and frequently observed when the metric approximation is compared to the true mismatch. [Footnote 7: About two decades ago, Benjamin Owen pointed this out to me, and told me that it was an effect arising from “projection onto the sphere”. He was right!] Below we provide both the explanation and a simple solution.
It is helpful to visualize this on the sphere. Imagine that we have a path in parameter space, parameterized by the variable $\lambda$ as shown in Fig. 1, which passes through the template $T_\lambda$ at parameter value $\lambda$, and $T_{\lambda'}$ at parameter value $\lambda'$. A generic parameterization is one for which the angle $\theta$ varies linearly with $\Delta\lambda$ for small values of $\Delta\lambda$. [Footnote 8: Note that the exact behavior of the mismatch as a function of the parameter coordinate separation depends upon the parameterization of the waveforms. Any non-linear transformation of the parameter will change this behavior, but a “generic” choice of parameterization will not. A generic parameterization is one for which the angle $\theta$ sweeps steadily along the sphere.]
For such a generic parameterization, the angular separation on the sphere is well approximated by
$$\theta \approx \sqrt{g_{ab}\, \Delta\lambda^a \Delta\lambda^b}. \tag{13}$$
This means that the mismatch can be written (in what we here call the “spherical approximation”) as
$$m = \sin^2\!\left(\sqrt{g_{ab}\, \Delta\lambda^a \Delta\lambda^b}\right). \tag{14}$$
The point of this short paper is that Eq. (14) is a better approximation to the generic mismatch than the more conventional approximation $m \approx g_{ab}\, \Delta\lambda^a \Delta\lambda^b$. While both of these approximations agree to lowest order in the parameter separation $\Delta\lambda$, for the generic case the approximation given in Eq. (14) will be accurate for a larger range in $\Delta\lambda$. It also has the advantage of always lying in the interval $[0, 1]$.
The simple example presented in Sec. III is a good demonstration of this. Fig. 2 shows how the behavior of the conventional metric approximation (orange curve) deviates from the actual mismatch (blue curve) as the parameter separation increases. The spherical approximation (green curve) given by
$$m = \sin^2\!\left(\frac{T \Delta\omega}{\sqrt{3}}\right) \tag{15}$$
is a much better fit to the actual mismatch. It deviates significantly from the actual mismatch only after the path on the sphere has exceeded a 90-degree separation between $T_\omega$ and $T_{\omega'}$.
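The comparison shown in Fig. 2 can be reproduced numerically with a few lines. This is a minimal sketch, assuming the exact mismatch of this example has the form $1 - \mathrm{sinc}^2(x)$ with dimensionless separation $x = T\Delta\omega$ and metric $T^2/3$ (the form that reproduces the grid spacings in Table 1); the function names are illustrative only.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity at 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def exact_mismatch(x):
    """Assumed exact mismatch 1 - sinc^2(x), with x = T * delta_omega."""
    return 1.0 - sinc(x) ** 2

def metric_mismatch(x):
    """Quadratic (metric) approximation x^2 / 3; unbounded as x grows."""
    return x * x / 3.0

def spherical_mismatch(x):
    """Spherical approximation sin^2(x / sqrt(3)); always within [0, 1]."""
    return math.sin(x / math.sqrt(3.0)) ** 2

separations = (0.1, 0.5, 1.0, 1.5, 2.0, 2.5)
print("   x    exact   metric  spherical")
for x in separations:
    print(f"{x:4.1f}  {exact_mismatch(x):7.4f}  {metric_mismatch(x):7.4f}  {spherical_mismatch(x):7.4f}")
```

Over this range the spherical column tracks the exact column more closely than the metric column does, and the gap widens as the separation grows.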
V A second example
The reader might ask if we have “tuned” our example in Sec. III. This is not so: the dependence upon the parameter $\omega$ (frequency) is quite typical. Here we present another typical case, where the signal model depends upon an offset phase parameter $\phi$.
For this example, the normalized time domain templates are
[TABLE]
and vanish outside the observation interval. As in our previous example, we assume that the signal goes through many cycles in the observation interval, so that $\omega T$ is large.
The overlap between templates is easily calculated from Eq. (2), giving
[TABLE]
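Since we are assuming that $\omega T \gg 1$, the second term on the rhs can be neglected, giving the mismatch
$$m = 1 - \cos^2(\Delta\phi) = \sin^2(\Delta\phi) \tag{18}$$
where $\Delta\phi = \phi - \phi'$.
In this case, the metric approximation to the mismatch yields the quadratic form $m \approx \Delta\phi^2$. In contrast, the spherical approximation to the mismatch gives
$$m = \sin^2\!\left(\sqrt{\Delta\phi^2}\right) = \sin^2(\Delta\phi). \tag{19}$$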
So in our second example, the spherical approximation is exact!
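This can be checked with sampled templates. The sketch below assumes templates of the form $\cos(\omega t + \phi)$ sampled over a whole number of cycles, with a plain dot product as the inner product; the cycle count, sample count, and function names are hypothetical choices made for illustration.

```python
import math

def phase_template(phi, cycles=200, n=20000):
    """Unit-length samples of cos(2*pi*cycles*t + phi) on t in [0, 1)."""
    raw = [math.cos(2.0 * math.pi * cycles * k / n + phi) for k in range(n)]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]

def mismatch(a, b):
    """Mismatch 1 - overlap^2 between two unit-length sampled templates."""
    o = sum(x * y for x, y in zip(a, b))
    return 1.0 - o * o

reference = phase_template(0.0)
offsets = (0.1, 0.5, 1.0, 1.5)
for dphi in offsets:
    m = mismatch(reference, phase_template(dphi))
    print(f"dphi={dphi:.1f}  sampled={m:.6f}  spherical={math.sin(dphi) ** 2:.6f}  metric={dphi ** 2:.6f}")
```

The sampled mismatch coincides with $\sin^2(\Delta\phi)$ at every offset, while the quadratic form exceeds 1 once $\Delta\phi > 1$.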
VI A third example
In our final example, the signal parameter is a constant frequency derivative . The normalized time domain templates are
[TABLE]
and vanish for . We assume that (half of the) dimensionless phase accumulated during the observation time is much larger than .
The overlap between templates is
[TABLE]
where is the Fresnel integral function, , and we have dropped small terms from the rhs in the third line of Eq. (21).
The exact mismatch is given by
[TABLE]
This is plotted in blue in Fig. 3. Since for small z, the normal metric approximation to the mismatch is . This is plotted in orange. Shown in green is the spherical approximation to the mismatch, .
As in the previous examples, the spherical approximation is a better fit to the true mismatch.
VII Why does it matter?
Why does this matter? After all, the metric approximation is only defined to quadratic order, and the mismatch can be expanded to higher order if needed. The point here is that there are really two approximations taking place. The first is in the Taylor approximation of the separation $\theta$ on the sphere, and the second is in the Taylor approximation of the sin function in the expression $m = \sin^2\theta$, which relates the mismatch to $\theta$. The conventional metric ansatz makes both of these approximations, whereas the spherical ansatz only uses the first of these approximations. So for generic behavior of the path in parameter space, the spherical ansatz will be more accurate than the metric ansatz. And since the spherical approximation just replaces $g_{ab}\, \Delta\lambda^a \Delta\lambda^b$ with $\sin^2\sqrt{g_{ab}\, \Delta\lambda^a \Delta\lambda^b}$, this comes with no additional analytic or computational cost.
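The size of the second approximation can be read off from the standard power series of $\sin^2\theta$, written here with $\theta$ denoting the angle between templates as in Sec. II:

```latex
\sin^2\theta \;=\; \theta^2 \;-\; \tfrac{1}{3}\,\theta^4 \;+\; \tfrac{2}{45}\,\theta^6 \;-\; \dots
```

To leading order the quadratic form therefore overestimates $\sin^2\theta$ by a fraction of about $\theta^2/3$, which is roughly 3% when $\theta^2 = 0.1$ and grows quickly from there.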
A more accurate approximation is useful because the metric is often used to construct grids in parameter space Mohanty and Dhurandhar (1996); Mohanty (1998); Allen et al. (1999); Cokelaer et al. (2004); Babak et al. (2006); Prix (2007a); Brown et al. (2007); Cokelaer (2007b); Ajith et al. (2008); Babak (2008); Messenger et al. (2009); Harry et al. (2009); Manca and Vallisneri (2010); Pisarski et al. (2011); Brown et al. (2013); Kumar et al. (2014); Fehrmann and Pletsch (2014); Wette (2015); Roy et al. (2017a). In situations where a search is not compute-power limited, these grids typically have a low mismatch. For example in CBC searches the traditional SNR mismatch is chosen at 3%, corresponding to an SNR² mismatch of 6%. For such small mismatches, the fact that $\sin^2\theta \approx \theta^2$ for small $\theta$ means that there is no significant difference between the metric and spherical approximations. However this may not be so for searches which are compute-power limited, for example in the search for CWs or the search for gamma-ray pulsars.
These computationally-limited searches often employ multiple hierarchical stages, which mix semi-coherent and coherent stages, each employing its own metric for template placement Prix (2007b, c, a); Messenger et al. (2009); Pletsch and Allen (2009); Pletsch (2010, 2011); Pletsch et al. (2012); Shaltev and Prix (2013); Wette and Prix (2013); Pletsch and Clark (2014); Leaci and Prix (2015); Abbott et al. (2017a, b); Clark et al. (2017); Wette et al. (2018); Dreissigacker et al. (2018); Nieder et al. (2018). Those hierarchical stages sometimes operate at substantial mismatches, and there the spherical approximation is an improvement on the conventional quadratic approximation.
We can illustrate this using the example from Sec. III. Suppose we set up a one-dimensional grid in frequency $\omega$, with a spacing $\Delta\omega$ picked to give a desired mismatch $m$. The metric approximation in Eq. (12) gives a parameter-space grid spacing
$$\Delta\omega = \frac{\sqrt{3m}}{T} \tag{23}$$
whereas the spherical approximation gives a grid spacing
$$\Delta\omega = \frac{\sqrt{3}}{T}\, \arcsin\sqrt{m}. \tag{24}$$
Effectively, the spherical approximation amounts to replacing the conventional metric mismatch $g_{ab}\, \Delta\lambda^a \Delta\lambda^b$ with $\sin^2\sqrt{g_{ab}\, \Delta\lambda^a \Delta\lambda^b}$. The effect of this on the grid spacings is shown in Table 1.
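The entries of Table 1 can be regenerated directly. The sketch below assumes the one-dimensional metric $T^2/3$ of the Sec. III example and reports spacings in units of $1/T$; the function names are illustrative, and the reference values in `table_1` are copied from Table 1.

```python
import math

def spacing_metric(m):
    """Grid spacing (in units of 1/T) from the quadratic metric approximation."""
    return math.sqrt(3.0 * m)

def spacing_spherical(m):
    """Grid spacing (in units of 1/T) from the spherical approximation."""
    return math.sqrt(3.0) * math.asin(math.sqrt(m))

# (mismatch, metric spacing, spherical spacing) as listed in Table 1.
table_1 = [
    (0.01, 0.173, 0.173), (0.02, 0.245, 0.246), (0.05, 0.387, 0.391),
    (0.1, 0.548, 0.557), (0.2, 0.775, 0.803), (0.5, 1.225, 1.360),
    (0.7, 1.449, 1.717), (0.9, 1.643, 2.163),
]

for m, _, _ in table_1:
    print(f"m={m:4.2f}  metric={spacing_metric(m):.3f}/T  spherical={spacing_spherical(m):.3f}/T")
```

At a target mismatch of 0.9 the spherical spacing is about 30% coarser than the metric one, so in this one-dimensional example a grid built from the metric spacing would use correspondingly more templates than the target mismatch requires.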
The spherical approximation might also provide a significant improvement for semi-coherent searches, when compared with the normal quadratic metric approximation. Semi-coherent methods are employed for computationally-limited electromagnetic and GW searches, and consist of breaking a long data stream into shorter “computationally-feasible” segments, each of which is searched using traditional matched-filter methods. The resulting “coherent” statistics (typically SNR values) are then summed to produce the semi-coherent statistic, as first proposed in Brady et al. (1998); Brady and Creighton (2000). To set up a template grid one computes a semi-coherent metric to predict the fractional loss of the semi-coherent statistic. Until now this metric has been computed by summing or averaging the coherent metrics for the segments of the coherent searches Pletsch and Allen (2009); Pletsch (2010); Wette and Prix (2013); Wette (2015); Leaci and Prix (2015). This averaged metric can be a poor approximation, and recent work has investigated its accuracy and empirical ways to extend the range of validity Wette (2016).
This work suggests a possible improvement. Instead of estimating the semi-coherent mismatch with an averaged metric
[TABLE]
it might be more accurate to instead compute the semi-coherent mismatch in the spherical approximation:
[TABLE]
Because the sin-squared of the average is not the average of the sin-squared, Eqs. (25) and (26) could differ substantially, particularly if the quadratic approximation to the metric significantly overestimates the mismatch in one or more of the coherent segments.
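A toy calculation shows how large the difference can be. This is a sketch under an assumed reading of the two estimates: the averaged-metric mismatch is taken to be the mean over segments of the per-segment quadratic form $g_{ab}\, \Delta\lambda^a \Delta\lambda^b$, and the spherical estimate the mean over segments of $\sin^2$ of its square root. The per-segment values and function names are hypothetical.

```python
import math

def averaged_metric_mismatch(segment_forms):
    """Mean over segments of the quadratic form g_ab dl^a dl^b (can exceed 1)."""
    return sum(segment_forms) / len(segment_forms)

def averaged_spherical_mismatch(segment_forms):
    """Mean over segments of sin^2(sqrt(q)); every term lies in [0, 1]."""
    return sum(math.sin(math.sqrt(q)) ** 2 for q in segment_forms) / len(segment_forms)

# Hypothetical per-segment values of g_ab dl^a dl^b for one template offset.
forms = [0.05, 0.10, 0.40, 1.50]
m_metric = averaged_metric_mismatch(forms)
m_spherical = averaged_spherical_mismatch(forms)
print(f"averaged metric: {m_metric:.3f}   averaged spherical: {m_spherical:.3f}")
```

The single segment whose quadratic form exceeds 1 dominates the averaged-metric estimate, whereas in the spherical estimate its contribution is capped at 1.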
VIII Conventions for overlap and mismatch
In much of the CBC literature the mismatch is defined as
[TABLE]
This SNR fractional mismatch should be contrasted with the SNR² fractional mismatch defined in Eq. (5). With this definition of the mismatch, the same considerations as above give the spherical approximation as
[TABLE]
This should be contrasted with the spherical approximation given in Eq. (14).
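As a quick consistency check between the two conventions, both can be written in terms of the angle $\theta$ between the templates. This sketch assumes that the SNR mismatch is the fractional loss of expected SNR, that is, one minus the overlap; the subscript labels are introduced here only for illustration.

```latex
m_{\mathrm{SNR}} = 1 - \cos\theta = 2\sin^2(\theta/2) \approx \tfrac{1}{2}\,\theta^2,
\qquad
m_{\mathrm{SNR}^2} = 1 - \cos^2\theta = \sin^2\theta \approx \theta^2 .
```

At small $\theta$ the SNR² mismatch is therefore about twice the SNR mismatch, consistent with the 3% and 6% figures quoted in Sec. VII.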
IX Conclusion
For three typical examples, we have shown that replacing the conventional metric mismatch $g_{ab}\, \Delta\lambda^a \Delta\lambda^b$ with $\sin^2\sqrt{g_{ab}\, \Delta\lambda^a \Delta\lambda^b}$ gives a better approximation to the true template mismatch. We have argued that this is to be expected in the generic case, and suggested that averaging the spherical approximation might provide a more accurate way to compute the mismatch in semi-coherent searches.
I thank Curt Cutler, Maria Alessandra Papa and Reinhard Prix for encouragement and helpful comments, and Ben Owen and B.S. Sathyaprakash for teaching me about matched filtering and metrics.
References
- [1] Schutz (1989): B. F. Schutz, Classical and Quantum Gravity 6, 1761 (1989).
- [2] 198 (1989): NATO Advanced Science Institutes (ASI) Series C, Vol. 253 (1989).
- [3] Sathyaprakash and Dhurandhar (1991): B. S. Sathyaprakash and S. V. Dhurandhar, Phys. Rev. D 44, 3819 (1991).
- [4] Schutz (1991): B. F. Schutz, in The Detection of Gravitational Waves, edited by D. G. Blair (1991) p. 406.
- [5] Finn and Chernoff (1993): L. S. Finn and D. F. Chernoff, Phys. Rev. D 47, 2198 (1993), arXiv:gr-qc/9301003 [gr-qc].
- [6] Cutler et al. (1993): C. Cutler, T. A. Apostolatos, L. Bildsten, L. S. Finn, E. E. Flanagan, D. Kennefick, D. M. Markovic, A. Ori, E. Poisson, G. J. Sussman, and K. S. Thorne, Phys. Rev. Lett. 70, 2984 (1993), arXiv:astro-ph/9208005 [astro-ph].
- [7] Sathyaprakash (1994): B. S. Sathyaprakash, Phys. Rev. D 50, R7111 (1994), arXiv:gr-qc/9411043 [gr-qc].
- [8] Cutler and Flanagan (1994): C. Cutler and É. E. Flanagan, Phys. Rev. D 49, 2658 (1994), arXiv:gr-qc/9402014 [gr-qc].
