A nonparametric method to assess significance of events in search for gravitational waves with false discovery rate
Hirotaka Yuzurihara, Shuhei Mano, Hideyuki Tagoshi

TL;DR
This paper introduces a nonparametric, assumption-free method to evaluate the significance of gravitational wave events using false discovery rate, providing a more straightforward alternative to existing techniques.
Contribution
It proposes a novel non-parametric approach to estimate p-values and assess event significance via q-values, differing from traditional methods like P_astro in gravitational wave analysis.
Findings
Consistent significance assessment for most known events.
Differences found in significance for some marginal events.
Method applicable to other gravitational wave searches.
Abstract
In this paper, we present a consistent procedure to assess the significance of gravitational wave events observed by laser interferometric gravitational wave detectors, based on the background distribution of the detection statistic. We propose a non-parametric method to estimate the p-value. Based on the estimated p-values, we propose a new procedure to assess the significance of a particular event with the q-value, which is the minimum false discovery rate that can be attained when calling the event significant. The q-value gives a criterion for the significance of events that differs from $p_{\rm astro}$, which is used in the LIGO-Virgo analysis and in other analyses. The proposed procedure is applied to the 1-OGC and 2-OGC catalogs [2][3]. For most of the events claimed significant in [2] and [3], we obtain the same results. However, there are differences in the…
| | Called significant | Called not significant | Total |
|---|---|---|---|
| Noise | $F$ | $m_0 - F$ | $m_0$ |
| Signal | $T$ | $m_1 - T$ | $m_1$ |
| Total | $S$ | $m - S$ | $m$ |
| UTC time | FAR$^{-1}$ (year) | p-value | q-value |
|---|---|---|---|
| 150914+09:50:45 | |||
| 151226+03:38:53 | |||
| 151012+09:54:43 | |||
| 151019+00:23:16 | |||
| 150928+10:49:00 | |||
| 151218+18:30:58 | |||
| 160103+05:48:36 | |||
| 151202+01:18:13 | |||
| 160104+03:51:51 | |||
| 151213+00:12:20 |
| UTC time | FAR$^{-1}$ (year) | p-value | q-value | TDR | $p_{\rm astro}$ |
|---|---|---|---|---|---|
| 150914+09:50:45 | |||||
| 151226+03:38:53 | |||||
| 151012+09:54:43 | |||||
| 160103+05:48:36 | |||||
| 151213+00:12:20 | |||||
| 151216+18:49:30 | |||||
| 151222+05:28:26 | |||||
| 151217+03:47:49 | |||||
| 151009+05:06:12 | |||||
| 151220+07:45:36 |
| UTC time | FAR$^{-1}$ (year) | p-value | q-value |
|---|---|---|---|
| 170104+10:11:58 | |||
| 150914+09:50:45 | |||
| 151226+03:38:53 | |||
| 170823+13:13:58 | |||
| 170817+12:41:04 | |||
| 170814+10:30:43 | |||
| 170809+08:28:21 | |||
| 170608+02:01:16 | |||
| 151012+09:54:43 | |||
| 170729+18:56:29 | |||
| 170121+21:25:36 | |||
| 170727+01:04:30 | |||
| 170818+02:25:09 | |||
| 170722+08:45:14 | |||
| 170321+03:13:21 | |||
| 170310+09:30:52 | |||
| 170809+03:55:52 | |||
| 170819+07:30:53 | |||
| 170618+20:00:39 | |||
| 170416+18:38:48 | |||
| 170331+07:08:18 | |||
| 151216+18:49:30 | |||
| 170306+04:45:50 | |||
| 151227+16:52:22 | |||
| 170126+23:56:22 | |||
| 151202+01:18:13 | |||
| 170208+20:23:00 | |||
| 170327+17:07:35 | |||
| 170823+13:40:55 | |||
| 150928+10:49:00 |
| UTC time | FAR$^{-1}$ (year) | p-value | q-value | $p_{\rm astro}$ |
|---|---|---|---|---|
| 170104+10:11:58 | ||||
| 150914+09:50:45 | ||||
| 151226+03:38:53 | ||||
| 170823+13:13:58 | ||||
| 170814+10:30:43 | ||||
| 151012+09:54:43 | ||||
| 170809+08:28:21 | ||||
| 170729+18:56:29 | ||||
| 170608+02:01:16 | ||||
| 170121+21:25:36 | ||||
| 170727+01:04:30 | ||||
| 170818+02:25:09 | ||||
| 170304+16:37:53 | ||||
| 151205+19:55:25 | ||||
| 170425+05:53:34 | ||||
| 170201+11:03:12 | ||||
| 151217+03:47:49 | ||||
| 151011+19:27:49 | ||||
| 151216+09:24:16 | ||||
| 170403+23:06:11 | ||||
| 170202+13:56:57 | ||||
| 170629+04:13:55 | ||||
| 170220+11:36:24 | ||||
| 170721+05:55:13 | ||||
| 170123+20:16:42 | ||||
| 170801+23:28:19 | ||||
| 170818+09:34:45 | ||||
| 170620+01:14:02 | ||||
| 151216+18:49:30 | ||||
| 170104+21:58:40 |
A nonparametric method to assess significance of events in search for gravitational waves with false discovery rate
Hirotaka Yuzurihara
Institute for Cosmic Ray Research, The University of Tokyo, Higashi-Mozumi 238, Kamioka-cho, Hida-shi, Gifu 506-1205 Japan
Shuhei Mano
The Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
Hideyuki Tagoshi
Institute for Cosmic Ray Research, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8582, Japan
Abstract
In this paper, we present a consistent procedure to assess the significance of gravitational wave events observed by laser interferometric gravitational wave detectors, based on the background distribution of the detection statistic. We propose a non-parametric method to estimate the p-value. Based on the estimated p-values, we propose a new procedure to assess the significance of a particular event with the q-value, which is the minimum false discovery rate that can be attained when calling the event significant. The q-value gives a criterion for the significance of events that differs from $p_{\rm astro}$, which is used in the LIGO-Virgo analysis and in other analyses. The proposed procedure is applied to the 1-OGC and 2-OGC catalogs N18 N20 . For most of the events claimed significant in N18 and N20 , we obtain the same results. However, there are differences in the significance of several marginal events. Since the proposed procedure does not require any assumptions on signal and noise, it is very simple and straightforward. The procedure is also applicable to other searches for gravitational waves whose background distribution of the detection statistic is difficult to know.
Subject index: gravitational wave search, compact binary coalescence
1 Introduction
The first gravitational wave event from a binary black hole coalescence, GW150914, was observed by the advanced LIGO detectors in the first observing run (O1) GW150914 . After the first detection, tens of gravitational wave events were reported LV18 . During the second observing run (O2), the first gravitational waves from a binary neutron star coalescence, GW170817 GW170817 , were observed by LIGO LIGO and Virgo Virgo . Follow-up observations by electromagnetic telescopes identified the host galaxy as NGC4993. The event strongly suggests the existence of radioactive decay of elements produced by the rapid neutron-capture process L17 . The discovery of these events has opened the field of gravitational wave astronomy. During the third observing run (O3), many candidate events were reported ref:GraceDB , and four events have been published individually gw190425 ; gw190412 ; gw190814 ; gw190521 . Very recently, the GWTC-2 catalog, which reports the gravitational wave signals from compact binary coalescences during the first half of O3, was released GWTC-2 . In the coming years, the network of gravitational wave detectors consisting of the two LIGO detectors, Virgo, and KAGRA KAGRA plans to perform coincident observation runs. As the detectors' sensitivities improve and the observation time becomes longer, we expect to observe more and more gravitational wave events.
In compact binary coalescence searches, we search for gravitational wave signals by maximizing the detection statistic over the template bank in a short time window. When the value of the detection statistic exceeds a given threshold, we record it as a trigger. Accordingly, for a given threshold, as the observing time and the template bank become larger, the probability that noise produces false triggers (the false alarm probability) becomes larger. This is called the multiple comparisons problem. Several methods have been proposed to control the false alarm probability. The Bonferroni correction is one such method (see Chapter 9 of LR08 ). However, these methods generally reduce the detection probability while controlling the false alarm probability.
Recently, the false discovery rate (FDR) was proposed to treat these problems (see Section 3 for the formal definition of the FDR). To the authors' knowledge, the FDR was first introduced to the gravitational wave community by Baggio and Prodi BP05 , but that paper did not discuss any actual problems. Recently, $p_{\rm astro}$ was introduced as a measure of true discovery of a particular event FGMC15 . In the recent catalog of gravitational waves from compact binary mergers LV18 , a candidate event is considered to have a gravitational wave origin if the false alarm rate is less than one per 30 days and $p_{\rm astro}$ is larger than 0.5.
In this paper, we propose the use of the q-value, which is a measure of the FDR. We present a consistent procedure to assess the significance of candidate events by using the q-value. We first introduce a definition of the p-value by using the background distribution of the detection statistic. Then, we propose a new procedure to evaluate the q-value of each event by extending the procedure proposed by Storey and Tibshirani ST03 . The original procedure ST03 is not applicable to a search for gravitational waves from compact binary coalescences, because it requires a complete list of p-values. In gravitational wave searches, a complete list of p-values is usually not available because we store only triggers whose detection statistic is larger than a certain threshold. We apply this procedure to the publicly available results of the 1-OGC and 2-OGC catalogs by Nitz et al. N18 ; N20 , and evaluate the q-value of each candidate event. We compare the result with the significance of each candidate event evaluated by using $p_{\rm astro}$. We find that the two measures give almost consistent results on the significance of each candidate event. However, we also find that the conclusion can differ for marginally significant events, although it depends on the thresholds chosen for the q-value and $p_{\rm astro}$. We find one such event in the 2-OGC catalog.
The main advantage of our procedure is that it is completely nonparametric; namely, we do not assume any parametric model behind the data. Our procedure can be applied to other gravitational wave searches. The non-parametric evaluation of the p-value, the procedure to evaluate the q-value, and the estimation of the q-value for the LIGO-Virgo O1 and O2 candidate events by using this procedure are all new in this paper.
This paper is organized as follows. In Section 2, we discuss statistical hypothesis testing in the search for gravitational waves from compact binary coalescences. In Section 3, we present a procedure to assess the significance of a particular event with the false discovery rate. In Section 4, the proposed procedure is applied to the 1-OGC and 2-OGC catalogs. Section 5 is devoted to a summary and discussion.
2 Estimation of p-value
We first introduce the statistical terminology used in this paper. The definitions can be found in a standard textbook, such as LR08 . By analyzing the data from gravitational wave detectors, we obtain events that have a larger signal-to-noise ratio than a threshold. Each event is classified as either signal or noise. If the event originates from a gravitational wave, it is called a signal. Otherwise, it is called noise. In the statistical literature, the noise model is called the null hypothesis (in this paper, also called the background) and the signal model is called the alternative hypothesis.
In the analysis of gravitational waves from compact binary coalescences, event search is done by maximizing the detection statistic over the templates. The detection statistic is also maximized over time within a certain time length.
In statistical hypothesis testing, the p-value of an event is a measure of the significance of the event. It is the probability that the event or rarer events occur under the null hypothesis. If the p-value of the event is sufficiently small, the null hypothesis is rejected. Let us consider statistical hypothesis testing of each event based on the background distribution of the detection statistic.
2.1 A conventional p-value
In the LIGO-Virgo O1 analysis, the following p-value was used A16 ; U16 (see Appendix A for discussion on the derivation):
$$p_{\rm c}(\rho) = 1 - \exp\left(-\frac{T}{T_b}\, n_b(\rho)\right), \tag{1}$$
where $\rho$ is the detection statistic of an event. In this paper, we call this the conventional p-value. Here, $T$ and $T_b$ are the time length of the analyzed data and the time length used for the estimation of the background distribution, respectively. The background data are usually generated by time-shifting the data of different detectors U16 . Moreover, $n_b(\rho)$ is the number of noise events in the background data whose detection statistics are equal to or larger than $\rho$. It is
$$n_b(\rho) = \sum_{i=1}^{N_b} 1(\rho_i \ge \rho), \tag{2}$$
where $\rho_i$ is the detection statistic of the $i$-th event in the background data, and $1(A) = 1$ if $A$ is true and 0 otherwise. Here $N_b$ is the total number of noise events in the background data. Therefore, $T n_b(\rho)/T_b$ in Eq. (1) is the mean number of events whose detection statistics are more than or equal to $\rho$. The ratio $n_b(\rho)/T_b$ is usually called the false alarm rate of the event whose detection statistic is $\rho$.
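The conventional p-value can be illustrated with a short numerical sketch. The code below is only an illustration under stated assumptions: it assumes the Poisson form $1-\exp(-T\,n_b(\rho)/T_b)$ for Eq. (1), the function names `n_b` and `conventional_p_value` are hypothetical, and the background statistics are toy numbers, not catalog data.

```python
import numpy as np

def n_b(background_stats, rho):
    """Number of background (noise) events whose statistic is >= rho."""
    return int(np.sum(np.asarray(background_stats) >= rho))

def conventional_p_value(background_stats, rho, T, T_b):
    """Probability of one or more noise events at least as loud as rho,
    assuming a Poisson count with mean T * n_b(rho) / T_b (assumed form)."""
    mean = T * n_b(background_stats, rho) / T_b
    return 1.0 - np.exp(-mean)

# Toy background statistics (hypothetical values, for illustration only)
bg = np.array([5.1, 6.3, 7.8, 8.2, 9.5])
p_c = conventional_p_value(bg, rho=8.0, T=1.0, T_b=100.0)
```

In this toy case two background events exceed the statistic 8.0, so the Poisson mean is 0.02 and the resulting probability is close to, but slightly smaller than, 0.02.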
2.2 Nonparametric estimation of p-value
Now, we introduce a non-parametric method to estimate the p-value. Let us assume the background distribution is continuous. If we know the probability density function of the detection statistic under the null hypothesis, $f_0(\rho)$, with distribution function $F_0(\rho)$, the p-value of an event whose detection statistic is $\rho$ is given by
$$p(\rho) = \int_\rho^\infty f_0(x)\,dx = 1 - F_0(\rho). \tag{3}$$
In reality the background distribution is unknown; nevertheless, it can be estimated non-parametrically (free from the assumption of a parameterized distribution) by using simulated background data. An estimator of the null distribution is given by
$$\hat F_0(\rho) = \frac{1}{N_b}\sum_{i=1}^{N_b} 1(\rho_i \le \rho),$$
where $\rho_i$ ($i = 1, \dots, N_b$) are the detection statistics of the $N_b$ events in the background data. It is important to distinguish $F_0$ and $\hat F_0$. The former is the (unknown) true background distribution, while the latter is an estimator of the background distribution. By the Glivenko-Cantelli theorem, $\hat F_0$ converges to $F_0$ almost surely and uniformly in $\rho$ LR08 . Therefore, an estimator of the p-value of an event whose detection statistic is $\rho$ is given by
$$\hat p(\rho) = 1 - \hat F_0(\rho) = \frac{n_b(\rho)}{N_b}, \tag{4}$$
where $n_b(\rho)$ is the number of background events whose detection statistics are equal to or larger than $\rho$, as in Eq. (2); the last equality holds almost surely because the background distribution is continuous. Note that $\hat p(\rho)$ is the probability of obtaining an event whose detection statistic is larger than $\rho$ in the background data and has been called (an estimator of) the false alarm probability in the gravitational wave community C17 . In addition, $\hat p(\rho)$ is proportional to the mean in (1). The estimator (4) is a consistent estimator of the p-value (3); namely, $\hat p(\rho)$ converges to $p(\rho)$ almost surely for each $\rho$ by the strong law of large numbers.
For later discussion, let us recall a basic property of a p-value. The p-value of a statistic following any continuous null distribution follows the uniform distribution, because
$$P_0\big(p(\rho) \le u\big) = u$$
is the distribution function of the uniform distribution, where $0 \le u \le 1$ and $P_0(A)$ is the probability of $A$ under the null hypothesis. It is worthwhile to mention that we cannot expect the conventional p-value given by (1), with $\rho$ following the null distribution, to follow the uniform distribution (see Appendix A). In the discussion that follows, we discuss the p-value defined by (3).
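The empirical estimator and the uniformity property can be checked numerically. The following is a minimal sketch, assuming the estimator is the fraction of background events at least as loud as the event; the function name `empirical_p_value` is hypothetical and the exponential background is a toy choice, not the distribution of any real search.

```python
import numpy as np

def empirical_p_value(background_stats, rho):
    """Fraction of background events with statistic >= rho.
    `rho` may be a scalar or an array of event statistics."""
    bg = np.sort(np.asarray(background_stats, dtype=float))
    # index of the first background value that is >= rho
    first_ge = np.searchsorted(bg, rho, side="left")
    return (bg.size - first_ge) / bg.size

# Toy check: noise events drawn from the same (assumed) background law
rng = np.random.default_rng(0)
background = rng.exponential(size=100_000)
noise_events = rng.exponential(size=2_000)
p_noise = empirical_p_value(background, noise_events)
```

For noise-only events the estimated p-values should be spread roughly uniformly over the unit interval, so their sample mean should be near 0.5.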
3 Assessment of significance with false discovery rate
In this section, we describe statistical hypothesis testing using the detection statistic and how to assess significance with the false discovery rate. When we perform the statistical test, each event falls into one of four possible outcomes, which are summarized in Table 1.
There are two kinds of truth (noise or signal) and two kinds of claim (called significant or called not significant). $F$ and $T$ are the numbers of noise and signal events called significant, respectively, and $S$ is the total number of events called significant. $m_0$ and $m_1$ are the numbers of noise and signal events, respectively. $m$ is the total number of events in the observed data.
In statistical hypothesis testing, a p-value threshold is selected to keep the number of false positives small. When we select the threshold $\alpha$, the expected number of false positives is $m_0\alpha$. If $m_0$ is very large, $\alpha$ should be selected to be very small.
Here, the probability $P(F \ge 1)$ is called the familywise error rate. The familywise error rate is simply called the false alarm probability in the gravitational wave community, but we call it the familywise false alarm probability in this paper to avoid confusion. The family means that we test a hypothesis by using $m$ tests. To control the familywise error rate such that $P(F \ge 1) \le \alpha$, that is, such that the probability that any noise event is called significant is at most $\alpha$, one of the solutions is to change the threshold to $\alpha/m$. This method is called Bonferroni's procedure (see Chapter 9 of LR08 ).
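The effect of the number of tests on the familywise error rate can be sketched numerically. This is an illustration only: it assumes independent noise-only tests with uniformly distributed p-values, and the function names and the values of `m` and `alpha` are hypothetical.

```python
def bonferroni_threshold(alpha, m):
    """Per-test p-value threshold that keeps the familywise error rate <= alpha."""
    return alpha / m

def familywise_error_rate(threshold, m):
    """P(at least one false positive) for m independent noise-only tests
    (independence is an assumption of this sketch)."""
    return 1.0 - (1.0 - threshold) ** m

m = 1_000_000          # hypothetical number of tests
alpha = 0.05           # hypothetical target familywise error rate
fwer_naive = familywise_error_rate(alpha, m)
fwer_bonf = familywise_error_rate(bonferroni_threshold(alpha, m), m)
```

With the uncorrected threshold a false positive is practically certain, while the Bonferroni threshold keeps the familywise error rate just below the target, at the price of a far stricter per-event threshold.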
Unfortunately, controlling the familywise error rate is practical only when extremely few events are expected to be signals. Otherwise, controlling the familywise error rate will be too conservative and the statistical power of the test procedure will be too poor. Benjamini and Hochberg BH95 introduced the false discovery rate, which is defined as the expected value of $F/S$, $E[F/S]$, where $F$ and $S$ are introduced in Table 1, and gave a test procedure to keep the FDR less than a threshold. A fairly recent survey of the FDR is B10 . Note that the false positive rate and the FDR are quite different measures. A false positive rate of $\alpha$ means that a fraction $\alpha$ of noise events are called significant. On the other hand, an FDR of $\alpha$ means that a fraction $\alpha$ of events called significant are noise events. Controlling the FDR should be more powerful than controlling the familywise error rate, since the FDR is less than or equal to the familywise error rate BH95 .
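The difference between the two measures can be made concrete with a small arithmetic sketch. The counts below are hypothetical, and the code computes the realized proportions for one outcome of the kind tabulated in Table 1; the FDR itself is the expectation of the second proportion.

```python
def false_positive_rate(F, m0):
    """Fraction of noise events that were called significant."""
    return F / m0

def false_discovery_proportion(F, S):
    """Fraction of events called significant that are noise (0 if none called)."""
    return F / S if S > 0 else 0.0

# Hypothetical counts: 100000 noise events, 20 signals
m0, m1 = 100_000, 20
F, T = 5, 15            # noise and signal events called significant
S = F + T               # all events called significant
fpr = false_positive_rate(F, m0)
fdp = false_discovery_proportion(F, S)
```

Here only a tiny fraction of the noise events is called significant, yet a quarter of the events called significant are noise, which is why the two measures must not be confused.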
Storey and Tibshirani ST03 introduced the q-value for a particular event, which is the expected proportion of false positives incurred when calling the event significant. Let us define FDR$(u)$, which is the FDR when calling significant all events whose p-value is less than or equal to a threshold $u$, where $0 < u \le 1$, namely,
$$\mathrm{FDR}(u) = E\left[\left.\frac{F(u)}{S(u)}\,\right|\, S(u) > 0\right], \tag{5}$$
where $E[X \mid A]$ is the expectation of $X$ given $A$. Here, $F(u)$ is the number of noise events whose p-value is smaller than or equal to the threshold $u$, and $S(u)$ is the number of both noise and signal events whose p-value is smaller than or equal to the threshold $u$. The definition of the q-value is the minimum FDR that can be attained when calling the event significant, namely,
$$q(p_i) = \min_{u \ge p_i} \mathrm{FDR}(u), \tag{6}$$
where $i \in \{1, \dots, m\}$, $m$ is the total number of events, and the p-value given by (3) of the $i$-th event is denoted by $p_i$. Note that FDR$(u)$ is not always monotonically increasing in the threshold $u$. Taking the minimum guarantees that the estimated q-value is increasing in the same order as the p-value.
Let us recall the procedure for estimating the q-value proposed by Storey and Tibshirani ST03 . Their estimator of FDR$(u)$ is
$$\widehat{\mathrm{FDR}}(u) = \frac{\hat\pi_0\, m\, u}{S(u)}, \tag{7}$$
where $\hat\pi_0$ is an estimator of $\pi_0 = m_0/m$, with $m_0$ the number of noise events, which indicates the overall proportion of noise events in the data. Roughly speaking, (7) is a sample mean whose population mean is (5). Since the p-value of a statistic follows the uniform distribution under the null hypothesis (see Section 2), the numerator of (7) is an estimator of $F(u)$.
How to estimate $\pi_0$ is the central issue. In gravitational wave searches, very few events are expected to be signals. In such a case, we can assume $\pi_0 \approx 1$. In Appendix B, we show that this assumption is justified by using the 1-OGC and 2-OGC catalogs. We thus set $\hat\pi_0 = 1$.
We can construct an estimator of the q-value by plugging the estimator of the p-value (4) and the estimator of the FDR (7) into the expression (6) and setting $\hat\pi_0 = 1$. The result is
$$\hat q(\hat p_i) = \min_{u \ge \hat p_i} \frac{m\, u}{S(u)}, \tag{8}$$
where $\hat p_i$ is the estimate (4) of the p-value of the $i$-th event and $S(u)$ is evaluated with the estimated p-values.
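When a complete list of p-values is available, the estimator described above can be sketched in a few lines. This is a minimal illustration, assuming the proportion of noise events is set to one so that the FDR estimate at threshold $u$ is $m\,u/S(u)$; the function name `q_values_full` is hypothetical. Because $S(u)$ is a step function, the minimum over thresholds is attained at the ordered p-values, so a running minimum from the largest p-value downwards is sufficient.

```python
import numpy as np

def q_values_full(p_values):
    """q-value estimates from a complete list of p-values, with the
    noise proportion set to 1 (assumption stated in the text)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # m * u / S(u) evaluated at u = p_(i), where S(p_(i)) = i
    ranked = m * p[order] / np.arange(1, m + 1)
    # running minimum taken from the largest p-value downwards
    q_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = q_sorted      # restore the input ordering
    return q

q = q_values_full([0.01, 0.04, 0.03, 0.5])
```

The running minimum is what makes the estimated q-values non-decreasing in the p-value, even when the raw ratio is not monotonic, as happens for the second and third smallest p-values in this toy input.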
4 Application to 1-OGC and 2-OGC results
In this section, we evaluate the q-values of events in the 1-OGC catalog N18 and in the 2-OGC catalog N20 . We use the data available at https://github.com/gwastro/1-ogc and https://github.com/gwastro/2-ogc. The available data sets contain information on each event such as the time, the false alarm rate in units of yr$^{-1}$, the value of the ranking statistic, the two masses, the dimensionless spin component of each star perpendicular to the orbital plane, etc. Each catalog consists of a complete data set and a bbh data set. There are 146,214 and 12,741 events in the complete and bbh data sets of 1-OGC, and 733,231 and 502,994 events in the complete and bbh data sets of 2-OGC, respectively. The complete data set contains all candidate events from the full analysis, and the bbh data set contains the candidate events from the analysis targeted at the BBH region N18 ; N20 .
Since the p-values of events are not available in these catalogs, we need to evaluate them from the false alarm rates (FAR). An estimate of the FAR is given by $n_b(\rho)/T_b$, where $T_b$ is the length of data used for background estimation, and $n_b(\rho)$ is defined by Eq. (2). The events in the catalog are defined by taking the event which gives the maximum detection statistic within a certain time window $\Delta t$ and in the template bank used in the analysis. Thus, the total number of background events, $N_b$, is given as $N_b = T_b/\Delta t$. In both 1-OGC and 2-OGC, seconds are used. Then, from Eq. (4), we obtain an estimate of the p-value of an event as
$$\hat p(\rho) = \frac{n_b(\rho)}{N_b} = \mathrm{FAR} \times \Delta t. \tag{9}$$
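The conversion from a catalog false alarm rate to a p-value estimate can be sketched as follows. This is an illustration under stated assumptions: it assumes the p-value estimate equals the FAR multiplied by the length of the time window that defines one event, the function name `p_value_from_far` and the parameter `window_seconds` are hypothetical, and the window length actually used in the catalogs must be supplied by the reader.

```python
import numpy as np

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year, used for unit conversion

def p_value_from_far(far_per_year, window_seconds):
    """p-value estimate as FAR times the event time window.
    `window_seconds` is a user-supplied assumption, not a catalog value."""
    far = np.asarray(far_per_year, dtype=float)
    return far * window_seconds / SECONDS_PER_YEAR

# Hypothetical false alarm rates in yr^-1 and a hypothetical 1 s window
p_hat = p_value_from_far([1e-4, 1.0, 100.0], window_seconds=1.0)
```

Because the mapping is linear, the ordering of events by FAR and by estimated p-value is identical; only the overall scale depends on the assumed window.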
We note that the candidate events in these data sets are not all events in the sense that only events with relatively low false alarm rates are recorded. This is due to a practical reason in order to reduce the computation time of the analysis. This is a typical situation in gravitational wave analysis.
Since not all candidate events are available, we cannot use the algorithm originally proposed in ST03 , which is explained as Algorithm 2 in Appendix C. Instead, we propose an alternative procedure for estimating the q-value, which is a modified version of Algorithm 2. Appendix C explains why Algorithm 1 yields estimates of the q-value defined in (8).
Algorithm 1.
We compute estimates of the q-value defined in (8). Let be the number of events whose false alarm rates are less than some value. Assume that events whose p-values are in the region around and larger than are noise.
1. Compute estimates of the p-value.
[TABLE]
where .
2. Let be the ordered p-values.
3. Set .
4. For , compute
[TABLE]
5. The estimated q-value for the -th most significant event is .
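A step-down computation of this kind for a truncated event list can be sketched as below. It is not a transcription of Algorithm 1, whose formulas are not reproduced here, but a sketch under explicit assumptions: the noise proportion is set to one, the total number of trials `m_total` (recorded or not) is a hypothetical input supplied by the user, the starting value at the largest recorded p-value is taken to be the same ratio as for the other events, and ratios are capped at one. The function name `q_values_truncated` is also hypothetical.

```python
import numpy as np

def q_values_truncated(p_recorded, m_total):
    """q-value estimates for recorded events only (sketch, assumptions above).
    Returns the sorted p-values and the matching q-value estimates."""
    p = np.sort(np.asarray(p_recorded, dtype=float))
    m_rec = p.size
    # expected number of noise events below p_(i), divided by the observed count i
    ratio = m_total * p / np.arange(1, m_rec + 1)
    ratio = np.minimum(ratio, 1.0)   # cap: a proportion cannot exceed 1 (assumption)
    # running minimum from the least significant recorded event upwards
    q = np.minimum.accumulate(ratio[::-1])[::-1]
    return p, q

p_sorted, q_est = q_values_truncated([1e-4, 1e-6, 2e-5], m_total=10_000)
```

Only the recorded events enter the running minimum, which is why an assumption about the unrecorded, noise-dominated region is needed to fix the starting value.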
4.1 1-OGC results
In the 1-OGC catalog N18 , the True Discovery Rate (TDR) and $p_{\rm astro}$ are given to evaluate the significance of events. The true discovery rate is the complement of the false discovery rate, FDR $= 1 -$ TDR. Note however that the evaluation of TDR in N18 is a very conservative estimate. In N18 , an estimate of TDR is defined as
$$\mathrm{TDR}(\rho) = \frac{\mathcal{T}(\rho)}{\mathcal{T}(\rho) + \mathcal{F}(\rho)}, \tag{10}$$
where $\mathcal{T}(\rho)$ is the rate that signals of astrophysical origin are observed with a ranking statistic of at least $\rho$, and $\mathcal{F}(\rho)$ is the FAR. In N18 , to estimate $\mathcal{T}$, two significant events GW150914 and GW151226 are assumed to be real astrophysical signals, and is obtained. In order to take into account the uncertainty in the estimate based on only two events, the Poisson distribution is assumed for the observed number, and as a lower 95% bound, is obtained. In N18 , this value is used in (10) for all events other than GW150914 and GW151226.
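The relation between the signal rate, the false alarm rate, and the TDR can be illustrated with a one-line computation. The sketch assumes the TDR is the signal rate divided by the sum of the signal rate and the false alarm rate at the event's ranking statistic; the function name and the rates are hypothetical and are not the values used in N18.

```python
def true_discovery_rate(signal_rate, false_alarm_rate):
    """TDR as signal rate over total rate at a given ranking statistic.
    Both rates must be in the same units (for example yr^-1)."""
    return signal_rate / (signal_rate + false_alarm_rate)

# Hypothetical rates in yr^-1
tdr = true_discovery_rate(signal_rate=9.0, false_alarm_rate=1.0)
fdr = 1.0 - tdr     # the complementary false discovery rate
```

A lower assumed signal rate lowers the TDR at fixed false alarm rate, which is the sense in which a lower bound on the signal rate gives a conservative estimate.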
On the other hand, $p_{\rm astro}$ is the posterior probability that a particular event has an astrophysical origin. In the 1-OGC catalog N18 , it is estimated as
$$p_{\rm astro}(\rho) = \frac{\Lambda_S P_S(\rho)}{\Lambda_S P_S(\rho) + \Lambda_N P_N(\rho)}, \tag{11}$$
where $P_S(\rho)$ and $P_N(\rho)$ are the probability densities of an event having ranking statistic $\rho$ given that the event is signal or noise, respectively, and $\Lambda_S$ and $\Lambda_N$ are the rates of signal and noise events. (Footnote 1: $p_{\rm astro}$ is also called purity in other fields of physics ref:purity .) In order to estimate $p_{\rm astro}$, an analytic model of the signal distribution and a fixed conservative rate of mergers are used by assuming that two events (GW150914 and GW151226) are of astrophysical origin. (Footnote 2: Note that the method to estimate $p_{\rm astro}$ in N18 is different from that used in the GWTC-1 catalog by the LIGO-Virgo collaboration LV18 and in the 2-OGC paper N20 .)
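The posterior probability described above can be sketched as a ratio of rate-weighted densities. The code assumes the two-component form in which the signal term is divided by the sum of the signal and noise terms; the function name `p_astro` and all numerical inputs are hypothetical, and no model for the densities is implied.

```python
def p_astro(rate_signal, density_signal, rate_noise, density_noise):
    """Posterior probability of astrophysical origin from rate-weighted
    signal and noise densities evaluated at the event's ranking statistic."""
    s = rate_signal * density_signal
    n = rate_noise * density_noise
    return s / (s + n)

# Hypothetical inputs: equal rate-weighted densities give an even split
prob = p_astro(rate_signal=1.0, density_signal=0.2,
               rate_noise=10.0, density_noise=0.02)
```

Unlike the q-value, this quantity depends on a model for the signal density and on the assumed signal rate, so it changes when those prior assumptions change.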
Figure 2 shows the q-values computed using Algorithm 1 from the p-values of events in the complete data set. Table 2 summarizes the estimated p-values and q-values for the 10 most significant events.
Figure 2 shows the q-values computed using Algorithm 1 from the p-values of events in the bbh data set. Table 3 summarizes the estimated p-values and q-values for the 10 most significant events, together with the inverse of the false alarm rate, TDR, and $p_{\rm astro}$ given in the 1-OGC catalog. For the first two events, since only an upper limit to the false alarm rate was evaluated in N18 , the estimated q-value of these events should be considered an upper limit to the q-value. TDR and $p_{\rm astro}$ are not given for the top two events in N18 , since these events are used to estimate the TDR and $p_{\rm astro}$ of other events.
Following N18 , we discuss the significance of events in the bbh case. In Table 3, if we call the events whose q-value is smaller than 0.05 significant, the top three events are significant. The expected proportion of false discoveries among the three events is less than 0.05. Since the q-value of GW151012 (151012+09:54:43) is , this event is significant enough to be regarded as a true signal. In N18 , since $p_{\rm astro}$ for GW151012 is , which is larger than 0.5, GW151012 is called significant. Thus, the results of the q-value and $p_{\rm astro}$ are consistent for this event.
In Table 3, we find two marginally not significant events, 160103+05:48:36 and 151213+00:12:20, whose q-values are and , respectively. On the other hand, $p_{\rm astro}$ for these events are small, and , respectively. So in N18 , these two events are called not significant. Although the conclusions are the same, the significance differs slightly between the q-value and $p_{\rm astro}$ in N18 , and this difference might be interesting. However, since these two events do not appear in the 2-OGC catalog discussed in the next subsection, we do not investigate them further.
The value of $1-$TDR is about one order of magnitude larger than the q-value for all events. Since TDR in N18 is a very conservative estimate, this difference is not surprising. Even in this case, TDR for GW151012 is . Thus, this event can be called significant. But TDR for 160103+05:48:36 and 151213+00:12:20 are 0.483 and 0.545, respectively. Thus, these cannot be called marginal events.
In the LIGO-Virgo GWTC-1 catalog of gravitational waves from compact binary mergers during O1 and O2 LV18 , a necessary condition for an event to be considered a gravitational wave signal is that the FAR of the event is less than one per 30 days, which corresponds to the p-value of . By linearly fitting the data in Figs. 2 and 2, we can evaluate that this p-value corresponds to the q-value of and , respectively. The q-value of 0.05 corresponds to a FAR of one per 271 days and one per 246 days, respectively. The q-value threshold of 0.05 is more stringent than the FAR of one per 30 days.
When we compare the q-values of the same event, the q-value in Table 3 is smaller than that in Table 2. The reason for this difference is that the events in the two data sets are computed from different numbers of templates. A smaller number of templates decreases the false alarm rate and the p-value. Accordingly, it produces a different q-value.
4.2 2-OGC results
Figure 4 shows the q-values as a function of the p-values in the complete data set. Table 4 summarizes the estimated q-values for the 30 most significant events.
Figure 4 shows the q-values as a function of the p-values in the bbh data set. Table 5 summarizes the estimated q-values for the top 30 events. $p_{\rm astro}$ computed in the 2-OGC paper N20 is also shown in this table. (Footnote 3: The method to estimate $p_{\rm astro}$ in N20 is based on a mixture model developed in Farr et al. FGMC15 and employed in the GWTC-1 catalog by the LIGO-Virgo collaboration LV18 .)
We discuss the significance of events for the bbh case. In Table 5, if we call the events whose q-value is smaller than 0.05 significant, the top 13 events are called significant. In N20 , these 13 events are called significant since $p_{\rm astro}$ is larger than 0.5. Thus, the results of the q-value and $p_{\rm astro}$ are consistent with each other. On the other hand, we obtain a different result for 151205+19:55:25. The q-value of this event is 0.07, while $p_{\rm astro}$ is 0.525. Thus, this is definitely a marginal event. If we call events with q-value less than 0.05 significant, this event cannot be called significant. On the other hand, in N20 , this event is called significant, since $p_{\rm astro}$ is larger than 0.5, and is identified as a new marginal binary black hole merger, GW151205.
Finally, we investigate the correspondence between the q-value and the FAR. By linearly fitting the data in Figs. 4 and 4, we can evaluate that the p-value of corresponds to the q-value of and , respectively. The q-value of 0.05 corresponds to a FAR of one per 384 days and one per 319 days, respectively. Thus, as in the case of 1-OGC, the q-value threshold of 0.05 is more stringent than the FAR of one per 30 days.
5 Summary and Discussion
In this paper, we presented a consistent procedure to assess the significance of each event. We proposed an estimator (4) of the p-value of a particular event in statistical hypothesis testing by using the empirical distribution of the detection statistic without any assumption on the background distribution. Generally, a p-value should follow the uniform distribution if all events originate from noise. The p-value defined in (4) has this property. On the other hand, the conventional p-value defined in (1) does not have this property in general. We thus believe that the p-value in (4) is more useful for assessing the significance of each event than that in (1). Moreover, we proposed a consistent procedure to evaluate the q-value, which is a measure of the FDR. In this procedure, we use the property that p-values follow the uniform distribution under the null hypothesis, and we do not need any assumptions on the distribution of signals. We applied this procedure to the 1-OGC and 2-OGC catalog data N18 N20 . A procedure to evaluate the q-value already exists in the literature ST03 . However, since not all events in the analysis are available in the catalogs, we proposed a new procedure to evaluate the q-value, which is a modified version of the original one.
The results are shown in Tables 2, 3, 4 and 5. For the bbh case of 1-OGC, if we call events with q-value less than 0.05 significant, we have 3 significant events, GW150914, GW151226 and GW151012. This is fully consistent with the conclusion of the 1-OGC paper N18 . We also found 2 marginally not significant events, 160103+05:48:36 and 151213+00:12:20, whose q-values are and , respectively. Since $p_{\rm astro}$ for these events are and , these are not identified as marginal events in N18 .
For the bbh case of 2-OGC, we have 13 significant events. All of them are also identified as significant based on $p_{\rm astro}$ in N20 . There is one marginal event, 151205+19:55:25. The q-value of this event is 0.07 but the $p_{\rm astro}$ computed in N20 is 0.525. Thus, the q-value suggests this event is marginally not significant, while $p_{\rm astro}$ suggests it is marginally significant. It is not easy to conclude from these results alone whether this signal is of astrophysical origin.
The method for estimating the q-value presented in this paper is very simple because we do not need any assumptions on the distribution of noise and signal. Note that the q-value and $p_{\rm astro}$ are based on fundamentally distinct statistical disciplines. The q-value is a frequentist measure, which is devised to estimate the FDR of events over some threshold of significance without any assumptions on signals. In contrast, $p_{\rm astro}$ is a Bayesian measure, which is devised to estimate the posterior probability of astrophysical origin of a particular event relying on prior assumptions on signals. Nevertheless, from the results discussed above, we found that both approaches lead to almost the same conclusions. This agreement is not at all trivial. It suggests that the prior assumptions on signals used in the computation of $p_{\rm astro}$ are close to reality. It should be useful to estimate the q-value as well as $p_{\rm astro}$ in gravitational wave searches. This should be true especially for marginal events like 151205+19:55:25 in this paper, since a different criterion provides additional information on the significance of an event.
We also note that the procedure for estimating the q-value presented in this paper is applicable to other searches for gravitational waves. It is not restricted to specific searches and can be used even when the true background distribution of the detection statistic is difficult to know, because the procedure is based on the empirical distribution, which is always available by time-shifting the time-series data of different detectors.
Acknowledgment
H.Y. and H.T. would like to thank Jishnu Suresh for fruitful discussion. We thank the authors of 1-OGC N18 and 2-OGC N20 for making the data set of the catalogs public. This work was supported by MEXT, JSPS Leading-edge Research Infrastructure Program, JSPS Grant-in-Aid for Specially Promoted Research 26000005, JSPS Grant-in-Aid for Scientific Research on Innovative Areas 2905: JP17H06358, JP17H06361, JP16H02183 and JP17H06364, JSPS Core-to-Core Program A. Advanced Research Networks, JSPS Grant-in-Aid for Scientific Research (S) 17H06133 and 15H00787, the joint research program of the Institute for Cosmic Ray Research, the cooperative research program of the Institute of Statistical Mathematics, National Research Foundation (NRF) and Computing Infrastructure Project of KISTI-GSDC in Korea, Academia Sinica (AS), AS Grid Center (ASGC) and the Ministry of Science and Technology (MoST) in Taiwan under grants including AS-CDA-105-M06, Advanced Technology Center (ATC) of NAOJ, Mechanical Engineering Center of KEK, the LIGO project, and the Virgo project.
Appendix A Derivation and meaning of
As in various scientific research fields ASA, there might be some confusion in the use of the $p$-value in the gravitational wave community. In the recent American Statistical Association statement on $p$-values ASA, the first principle is "$P$-values can indicate how incompatible the data are with a specified statistical model". Therefore, whenever we speak of a $p$-value, we have to make clear which statistical model we are talking about. In this appendix, we discuss the derivation and meaning of the conventional $p$-value defined by (1), which is the probability, under the noise model, of observing one or more noise events as strong as a signal with a given detection statistic. In the analysis paper of the event GW150914 A16, Abbott et al. called this quantity a $p$-value; in the main text, however, we have not called it a $p$-value, to avoid possible confusion with the $p$-value defined by (3).
Let us see more details of the probability (1) which was proposed by Usman et al. in Appendix of U16 . The total number of noise events in the observed data, , is modeled parametrically with the Poisson process of mean :
[TABLE]
where . The slight difference between the expression of in (1) and the expression in Eq. 17 of U16 (the unity in the numerator) comes from the fact that the model used by Usman et al. U16 involves observed events. In contrast, (1) is based only on the noise events in simulated background data, because we believe that the noise model is better constructed from noise events only. In addition, Usman et al. U16 considered randomness in the number of candidate events and then marginalized it out. However, these steps have no influence on the final expression if (compare Equations A.4 and A.12 in U16 ). Then, the probability of observing one or more noise events as strong as a signal whose detection statistic is under the noise model during the observation time, , is given by (1). In the same manner, if we consider the probability of observing or more noise events as strong as a signal whose detection statistic is under the noise model during the observation time, the $p$-value is
[TABLE]
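The Poisson tail probabilities described in this appendix can be checked numerically. The sketch below is a minimal illustration that assumes only that the number of noise events is Poisson distributed; the function name and the parameters `mu` (the Poisson mean) and `k` (the minimum number of noise events) are hypothetical names, and how the mean is obtained from the background data is not reproduced here.

```python
import math


def prob_at_least_k_noise_events(mu, k=1):
    """P(N >= k) for N ~ Poisson(mu).

    mu: assumed Poisson mean of the number of noise events.
    k:  minimum number of noise events (k=1 gives 1 - exp(-mu)).
    """
    if mu < 0 or k < 0:
        raise ValueError("mu and k must be non-negative")
    # P(N >= k) = 1 - sum_{j=0}^{k-1} exp(-mu) * mu**j / j!
    prob_fewer = sum(
        math.exp(-mu) * mu**j / math.factorial(j) for j in range(k)
    )
    return 1.0 - prob_fewer


one_or_more = prob_at_least_k_noise_events(0.1, k=1)
two_or_more = prob_at_least_k_noise_events(0.1, k=2)
```

For a very small mean, the direct difference `1 - exp(-mu)` loses floating-point precision; `-math.expm1(-mu)` is a more accurate alternative for the `k=1` case.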
Appendix B Discussion on
In this Appendix, we show that the estimator in Eq. (7) of the overall proportion of noise events in the data can be approximated by unity. Setting this estimator to unity is reasonable when very few events are expected to be signal, as in a gravitational wave search. In fact, Benjamini and Hochberg's proposal BH95 corresponds to this choice. On the other hand, for data in which some portion of events is expected to be signal, such as in genome-wide studies, Storey and Tibshirani ST03 proposed , where , is an estimate of discussed in (14).
We consider a list of $p$-values which contains $p$-values less than a certain value, and set . We assume that the maximum $p$-value in this list is . In this case, is the number of noise events whose $p$-values are between [math] and . We consider a function
[TABLE]
where . If all $p$-values larger than are noise, for , since the $p$-values of noise events follow a uniform distribution.
As an estimator of , let us consider a function
[TABLE]
where . If all $p$-values larger than are noise, for . In particular, if all $p$-values are noise, for .
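As an illustration of this kind of estimator, the following Python sketch counts the $p$-values above a tuning value and divides by the count expected if every $p$-value in the list were uniformly distributed noise. It assumes the standard Storey-type form, rescaled for a list truncated at a maximum $p$-value; the exact expression in Eq. (14) may differ, and the names `pi0_estimate`, `lam` and `p_max` are hypothetical.

```python
def pi0_estimate(p_values, lam, p_max=1.0):
    """Storey-type estimate of the noise proportion at tuning value lam.

    Assumption: the list contains only p-values <= p_max, and noise
    p-values are uniform on [0, p_max].
    """
    if not 0.0 <= lam < p_max:
        raise ValueError("lam must satisfy 0 <= lam < p_max")
    m = len(p_values)
    if m == 0:
        raise ValueError("p_values is empty")
    n_above = sum(1 for p in p_values if p > lam)
    # A uniform null puts a fraction (1 - lam / p_max) of the noise
    # p-values above lam, so this ratio is near 1 for pure noise.
    return n_above / (m * (1.0 - lam / p_max))
```

For a list made of pure noise, the returned value scatters around unity, and the scatter grows as `lam` approaches `p_max` because fewer $p$-values remain in the numerator.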
Figure 6 is the plot of for the complete data set of 1-OGC. In this plot, we use events whose $p$-value is less than 0.3. We can see that in Eq. (14) is almost unity for . We have in this region. This means that almost all $p$-values are noise, except for very few $p$-values around zero. The larger scatter in is due to the statistical fluctuation caused by the smaller number in the numerator of Eq. (14). Since we are mainly interested in the events with small $p$-value less than , we set .
The situation is similar for the bbh case. Figure 6 is the plot of for for the bbh data set of 1-OGC. In this plot, we use events whose $p$-value is less than 0.025. We have for . We have a larger deviation from unity, for . This is due to the statistical fluctuation caused by the smaller number in the numerator of Eq. (14). For the same reason as for the complete data set, we set .
Figure 8 is the plot of for the complete data set of 2-OGC. In this plot, we use 103,185 events whose $p$-value is less than 0.07. We can see that in Eq. (14) is almost unity for . We have in this region. The small deviation from unity near is due to the statistical fluctuation.
Figure 8 is the plot of for for the bbh data set of 2-OGC. In this plot, we use 152,759 events whose $p$-value is less than 0.10. We have for . The larger deviation from unity for is due to the statistical fluctuation. For the same reason as for the 1-OGC data set, we set in Step 3 of Algorithm 1.
Appendix C Derivation of Algorithm 1 to estimate $q$-values
In this appendix, we discuss the estimation procedure for $q$-values. We first introduce an algorithm which is a slight modification of the procedure given in Remark B of the Appendix of ST03. The input is the list of detection statistics obtained from the observed data and the detection statistics in simulated background data.
Algorithm 2. Compute estimates of $q$-values defined in (8).
1. Compute estimated $p$-values
[TABLE]
where , is the detection statistic of the -th event and is given in (2).
2. Let be the ordered $p$-values.
3. Set .
4. For , compute
[TABLE]
5. The estimated $q$-value for the -th most significant event is defined in (8).
Since Algorithm 2 is our starting point to construct Algorithm 1, we reproduce it here. If we set , Algorithm 1 reduces to Algorithm 2.
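The step-down recursion on which Algorithm 2 builds can be sketched in Python. This is a minimal sketch of the standard procedure in Remark B of ST03: the $q$-value of the largest $p$-value is the noise proportion times that $p$-value, and each smaller one is the minimum of its own FDR estimate and the $q$-value of the next larger $p$-value. It is not a reproduction of Algorithm 2, whose modification is not spelled out here; the function name `estimate_q_values` and the parameter `pi0` (the assumed proportion of noise events, 1 by default) are illustrative assumptions.

```python
def estimate_q_values(p_values, pi0=1.0):
    """Step-down q-value estimates, returned in the input order.

    pi0: assumed proportion of noise events (1.0 treats nearly all
    events as noise, as discussed in Appendix B).
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the events sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    # Largest p-value: q = pi0 * m * p / m = pi0 * p.
    q_next = pi0 * p_values[order[-1]]
    q_values[order[-1]] = q_next
    # Walk down from the second largest to the smallest p-value.
    for rank in range(m - 1, 0, -1):  # rank is 1-based
        idx = order[rank - 1]
        q_next = min(pi0 * m * p_values[idx] / rank, q_next)
        q_values[idx] = q_next
    return q_values


# Toy p-values (made-up numbers, not catalog data).
p_example = [0.01, 0.04, 0.03, 0.5]
q_example = estimate_q_values(p_example)
```

Taking the running minimum makes the estimated $q$-values non-decreasing in the $p$-value, so calling an event significant at a given threshold also calls every more significant event significant.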
Since $p$-values in the region around and larger than are noise, if we take the threshold in , we obtain where and are defined in Table 1. Accordingly, (7) is
[TABLE]
which is monotonically increasing in . Therefore, we may replace Step 3 with . How Step 4
[TABLE]
still gives (8) for can be seen by induction. Assume (15) gives (8) for . Note that
[TABLE]
Let us show that (15) gives (8) for , namely,
[TABLE]
and the equality holds for some .
- If , then . Note that the equality of (17) holds if . For ,
[TABLE]
For ,
[TABLE]
where the second-to-last inequality follows from (15) and the last inequality follows from (16). Applying a similar argument iteratively proves the assertion.
- If , then . For ,
[TABLE]
For ,
[TABLE]
Suppose the second-to-last equality holds, namely, the equality of (17) holds at . Then, for ,
[TABLE]
For ,
[TABLE]
Applying a similar argument iteratively proves the assertion. If the second-to-last equality of (18) does not hold, there exists some such that and
[TABLE]
because . The assertion can be shown in a similar manner.
- (1) Abbott B.P. et al. (LIGO Scientific Collaboration, Virgo Collaboration). Observation of gravitational waves from a binary black hole merger. Phys. Rev. Lett. 116, 061102 (2016).
- (2) Nitz A.H. et al. 1-OGC: The first open gravitational-wave catalog of binary mergers from analysis of public advanced LIGO data. Ap. J. 872, 195 (2019).
- (3) Nitz A.H. et al. 2-OGC: Open Gravitational-wave Catalog of Binary Mergers from Analysis of Public Advanced LIGO and Virgo Data. Ap. J. 891, 123 (2020).
- (4) The LIGO Scientific Collaboration, the Virgo Collaboration. GWTC-1: A gravitational-wave transient catalog of compact binary mergers observed by LIGO and Virgo during the first and second observing runs. Phys. Rev. X 9, 031040 (2019).
- (5) Abbott B.P. et al. (LIGO Scientific Collaboration, Virgo Collaboration). GW170817: Observation of Gravitational Waves from a Binary Neutron Star Inspiral. Phys. Rev. Lett. 119, 161101 (2017).
- (6) Abbott B. et al. LIGO: the laser interferometer gravitational-wave observatory. Rept. Prog. Phys. 72, 076901 (2009).
- (7) Accadia T. et al. Virgo: a laser interferometer to detect gravitational waves. Journal of Instrumentation 7, P03012 (2012).
- (8) LIGO Scientific Collaboration et al. Multi-messenger observations of a binary neutron star merger. Astrophys. J. Lett. 848, L12 (59pp) (2017).
