Burst search method based on likelihood ratio in Poisson Statistics

Ce Cai; Shao-Lin Xiong; Wang-Chen Xue; Yi Zhao; Shuo Xiao; Qi-Bin Yi,; Zhi-Wei Guo; Jia-Cong Liu; Yan-Qiu Zhang; Chao Zheng; Sheng-Lun Xie; Yan-Qi; Du; Xiao-Yun Zhao; Cheng-Kui Li; Ping Wang; Wen-Xi Peng; Shi-Jie Zheng,; Li-Ming Song; Xin-Qiao Li; Xiang-Yang Wen; Fan Zhang

arXiv:2302.11755·astro-ph.HE·February 24, 2023

Burst search method based on likelihood ratio in Poisson Statistics

Ce Cai, Shao-Lin Xiong, Wang-Chen Xue, Yi Zhao, Shuo Xiao, Qi-Bin Yi,, Zhi-Wei Guo, Jia-Cong Liu, Yan-Qiu Zhang, Chao Zheng, Sheng-Lun Xie, Yan-Qi, Du, Xiao-Yun Zhao, Cheng-Kui Li, Ping Wang, Wen-Xi Peng, Shi-Jie Zheng,, Li-Ming Song, Xin-Qiao Li, Xiang-Yang Wen, Fan Zhang

PDF

TL;DR

This paper introduces a Poisson-based likelihood ratio method for detecting weak and short X-ray and gamma-ray bursts, demonstrating its advantages over traditional Gaussian-based methods through simulations.

Contribution

It proposes a novel Poisson-based coherent search method for transient detection, improving sensitivity for weak bursts compared to Gaussian-based approaches.

Findings

01

Poisson-based method outperforms Gaussian for weak bursts

02

Likelihood ratio follows chi2 distribution, simplifying significance estimation

03

Enhanced detection capability for very low count transients

Abstract

Searching for X-ray and gamma-ray bursts, including Gamma-ray bursts (GRBs), Soft Gamma-ray Repeaters (SGRs) and high energy transients associated with Gravitational wave (GW) events or Fast radio bursts (FRBs), is of great importance in the multi-messenger and multi-wavelength era. Although a coherent search based on the likelihood ratio and Gaussian statistics has been established and utilized in many studies, this Gaussian-based method could be problematic for weak and short bursts which usually have very few counts. To deal with all bursts including weak ones, here we propose the coherent search in Poisson statistics. We studied the difference between Poisson-based and Gaussian-based search methods by Monte Carlo simulations, and find that the Poisson-based search method has advantages compared to the Gaussian case especially for weak bursts. Our results show that, for very weak…

Tables2

Table 1. Table 1: The result of estimation of best amplitude and likelihood ratio in Gaussian statistics and Poisson statistics.

number	$\hat{n}$	$r$	$d$	$S_{Poisson}$	$S_{Gaussian}$	$ℒ_{g} (S_{Gaussian})$	$ℒ_{g} (S_{Poisson})$	$ℒ_{p} (S_{Gaussian})$	$ℒ_{p} (S_{Poisson})$
1	2	9	10	0.89	0.83	15.20	15.19	8.07	8.09
2	2	9	11	1.00	0.95	19.40	19.39	9.74	9.75
3	2	9	9	0.78	0.72	11.51	11.49	6.52	6.53

Table 2. Table 2: Test result for significance in Gaussian statistics and Poisson statistics.

number	$S_{Poisson}$	$S_{Gaussian}$	$ℒ_{p}$	$ℒ_{g}$	$P_{Poisson}^{a} (ℒ_{p})$	$P_{Gaussian}^{b} (ℒ_{g})$	$P_{χ^{2}}^{c} (2 ℒ_{p})$
1	0.030	0.030	11.17	16.29	$2.51 \times 10^{- 6}$	$5.32 \times 10^{- 5}$	$2.28 \times 10^{- 6}$
2	0.031	0.031	9.47	15.62	$2.22 \times 10^{- 5}$	$6.74 \times 10^{- 5}$	$1.34 \times 10^{- 5}$
3	0.032	0.036	10.13	18.47	$7.96 \times 10^{- 6}$	$1.76 \times 10^{- 5}$	$6.75 \times 10^{- 6}$
4	0.035	0.034	12.17	17.51	$4.19 \times 10^{- 7}$	$2.38 \times 10^{- 5}$	$8.07 \times 10^{- 7}$
5	0.030	0.030	8.79	13.86	$3.93 \times 10^{- 5}$	$1.86 \times 10^{- 4}$	$2.75 \times 10^{- 5}$

Equations20

Λ = \frac{P ( d ∣ H _{1} ( s ))}{P ( d ∣ H _{0} )},

Λ = \frac{P ( d ∣ H _{1} ( s ))}{P ( d ∣ H _{0} )},

P (d_{i} ∣ H_{1}) = i = 1 \prod j \frac{1}{2 π σ _{d_{i}}} exp (- \frac{( d _{i} - r _{i} s ) ^{2}}{2 σ _{d_{i}}^{2}}) (i = 1, 2...j),

P (d_{i} ∣ H_{1}) = i = 1 \prod j \frac{1}{2 π σ _{d_{i}}} exp (- \frac{( d _{i} - r _{i} s ) ^{2}}{2 σ _{d_{i}}^{2}}) (i = 1, 2...j),

P (d_{i} ∣ H_{0}) = i = 1 \prod j \frac{1}{2 π σ _{n_{i}}} exp (- \frac{d _{i} ^{2}}{2 σ _{n_{i}}^{2}}),

P (d_{i} ∣ H_{0}) = i = 1 \prod j \frac{1}{2 π σ _{n_{i}}} exp (- \frac{d _{i} ^{2}}{2 σ _{n_{i}}^{2}}),

d_{i} = d_{i} - \overset{n}{^}_{i},

d_{i} = d_{i} - \overset{n}{^}_{i},

L_{g} = ln \frac{P ( d _{i} ∣ H _{1} )}{P ( d _{i} ∣ H _{0} )} = i = 1 \sum j [ln \frac{σ _{n_{i}}}{σ _{d_{i}}} + \frac{d _{i} ^{2}}{2 σ _{n_{i}}^{2}} - \frac{( d _{i} - r _{i} s ) ^{2}}{2 σ _{d_{i}}^{2}}] .

L_{g} = ln \frac{P ( d _{i} ∣ H _{1} )}{P ( d _{i} ∣ H _{0} )} = i = 1 \sum j [ln \frac{σ _{n_{i}}}{σ _{d_{i}}} + \frac{d _{i} ^{2}}{2 σ _{n_{i}}^{2}} - \frac{( d _{i} - r _{i} s ) ^{2}}{2 σ _{d_{i}}^{2}}] .

P (d_{i} ∣ H_{1}) = i = 1 \prod j \frac{( r _{i} s + n ^ _{i} ) ^{d_{i}}}{d _{i} !} exp (- (r_{i} s + \overset{n}{^}_{i}))

P (d_{i} ∣ H_{1}) = i = 1 \prod j \frac{( r _{i} s + n ^ _{i} ) ^{d_{i}}}{d _{i} !} exp (- (r_{i} s + \overset{n}{^}_{i}))

P (d_{i} ∣ H_{0}) = i = 1 \prod j \frac{( n ^ _{i} ) ^{d_{i}}}{d _{i} !} exp (- \overset{n}{^}_{i})

P (d_{i} ∣ H_{0}) = i = 1 \prod j \frac{( n ^ _{i} ) ^{d_{i}}}{d _{i} !} exp (- \overset{n}{^}_{i})

L_{p} = ln \frac{P ( d _{i} ∣ H _{1} )}{P ( d _{i} ∣ H _{0} )} = i = 1 \sum j [d_{i} ln (\frac{r _{i} s + n ^ _{i}}{n ^ _{i}}) - r_{i} s],

L_{p} = ln \frac{P ( d _{i} ∣ H _{1} )}{P ( d _{i} ∣ H _{0} )} = i = 1 \sum j [d_{i} ln (\frac{r _{i} s + n ^ _{i}}{n ^ _{i}}) - r_{i} s],

s_{true} = \frac{d - n ^}{r} .

s_{true} = \frac{d - n ^}{r} .

P_{T} = \frac{N _{LR > T}}{N _{total}},

P_{T} = \frac{N _{LR > T}}{N _{total}},

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Burst search method based on likelihood ratio in Poisson Statistics

Ce Cai1,2,3, Shao-Lin Xiong2, Wang-Chen Xue2,3, Yi Zhao2,4, Shuo Xiao2,3,9,10, Qi-Bin Yi2,5, Zhi-Wei Guo2,6, Jia-Cong Liu2,3, Yan-Qiu Zhang2,3, Chao Zheng2,3, Sheng-Lun Xie2,7, Yan-Qi Du2,8, Xiao-Yun Zhao2, Cheng-Kui Li2, Ping Wang2, Wen-Xi Peng2, Shi-Jie Zheng2,3, Li-Ming Song2,3, Xin-Qiao Li2, Xiang-Yang Wen2, Fan Zhang2

1 College of Physics, Hebei Normal University, 20 South Erhuan Road, Shijiazhuang 050024, Hebei, China

2 Key Laboratory of Particle Astrophysics, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China

3 University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China

4 Department of Astronomy, Beijing Normal University, Beijing 100875, China

5 School of Physics and Optoelectronics, Xiangtan University, Xiangtan 411105, Hunan, China

6 College of physics Sciences Technology, Hebei University, No. 180 Wusi Dong Road, Lian Chi District, Baoding 071002, Hebei, China

7 Institute of Astrophysics, Central China Normal University, Wuhan 430079, Hubei, China

8 Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, Sichuan, China

9 Guizhou Provincial Key Laboratory of Radio Astronomy and Data Processing, Guizhou Normal University, Guiyang 550001, GuiZhou, China

10 School of Physics and Electronic Science, Guizhou Normal University, Guiyang 550001, GuiZhou, China E-mail: [email protected]

(Accepted XXX. Received YYY; in original form ZZZ)

Abstract

Searching for X-ray and gamma-ray bursts, including Gamma-ray bursts (GRBs), Soft Gamma-ray Repeaters (SGRs) and high energy transients associated with Gravitational wave (GW) events or Fast radio bursts (FRBs), is of great importance in the multi-messenger and multi-wavelength era. Although a coherent search based on the likelihood ratio and Gaussian statistics has been established and utilized in many studies, this Gaussian-based method could be problematic for weak and short bursts which usually have very few counts. To deal with all bursts including weak ones, here we propose the coherent search in Poisson statistics. We studied the difference between Poisson-based and Gaussian-based search methods by Monte Carlo simulations, and find that the Poisson-based search method has advantages compared to the Gaussian case especially for weak bursts. Our results show that, for very weak bursts with very low number of counts, the Poisson-based search can provide higher significance than the Gaussian-based search, and its likelihood ratio (for background fluctuation) still generally follows the $\chi^{2}$ distribution, making the significance estimation of searched bursts very convenient. Thus, we suggest that the coherent search based on Poisson likelihood ratio is more appropriate in the search for generic transients including very weak ones.

keywords:

methods: data statistical – methods: data analysis – (stars:) gamma-ray burst: general

††pubyear: 2022††pagerange: Burst search method based on likelihood ratio in Poisson Statistics–Burst search method based on likelihood ratio in Poisson Statistics

1 Introduction

Recent discoveries of a Gamma-ray burst (GRB 170817A) associated with the Gravitational wave (GW 170817) (Abbott et al., 2017; Goldstein et al., 2017; Savchenko et al., 2017; Li et al., 2018) and a non-thermal X-ray burst from the Galactic magnetar (SGR J1935+2154) associated with the Fast radio burst (FRB 200428) (Li et al., 2021; CHIME/FRB Collaboration et al., 2020; Bochenek et al., 2020; Mereghetti et al., 2020; Tavani et al., 2020; Ridnaia et al., 2021) highlight the importance of the observations of high energy transients in X-ray and gamma-ray band.

As the ground-based gravitational wave observatories (LIGO, Virgo and KAGRA) continue to upgrade, their detected GW events would likely be further and the presumptive gamma-ray bursts associated with those GW events would probably be very weak. Also, as most of the detected FRBs are of extragalactic origin (e.g. Lorimer et al., 2007), if they are also associated with X-ray bursts from extragalactic magnetars (e.g. Yang et al., 2020), then these X-ray bursts are expected to be very weak and short. In fact, gamma-ray transients with short duration have been conceived to be the possible counterparts of FRBs (Guidorzi et al., 2020a). For example, Guidorzi et al. (2020a, b) searched Insight-HXMT data for FRB-associated gamma-ray counterparts with time scales down to milliseconds and even sub-milliseconds (i.e. 0.1 ms).

On the other hand, the bright and sharp peak of GRBs may appear in very short duration down to milliseconds due to the tip-of-the-iceberg effect (Li et al., 2016; Moss et al., 2022). It is also well known that GRBs tend to be shorter in duration in higher energy range band (e.g. Song et al., 2022). It is found that the duration of magnetar short bursts could be shorter than about 50 ms (Cai et al., 2022; Collazzi et al., 2015; Lin et al., 2020). As Hurley et al. (2005) pointed out, an extragalactic giant flar as bright as SGR1806–20 could appear as very short-duration depending on its distance owing to the tip-of-the-iceberg effect. In fact, much shorter (down to millisecond) bursts have been detected from the Earth (i.e. Terrestrial Gamma-ray flashes and Terrestrial Electron Beams, Fishman et al., 1994; Xiong et al., 2012).

We note that most gamma-ray detectors usually have temporal resolution of several microseconds, which is sufficient to detect sub-milliseconds bursts. For example, GECAM (Li et al., 2022; An et al., 2020) has a temporal resolution of about 0.1 $\mu$ s (Xiao et al., 2022b) which allow us to exploit some extreme short bursts. These short and weak GRB and SGR bursts would leave a small number of signal counts in detectors.

Meanwhile, during such a short duration of the burst, the background counts would be very small as well. For example, the average background level of each GRD detector of GECAM and each NaI detector of Fermi/GBM is about 1 count per 1 ms (in energy range of about 10-200 keV) (Lin et al., 2011; Xiao et al., 2022a). For each CsI detector of Insight-HXMT/HE, this number is about 0.6 counts per 1 ms (in energy range of about 80 - 800 keV) (Zhang et al., 2020; Liu et al., 2020).

Therefore, it is critically important to find and study these weak X-ray and gamma-ray bursts in the multi-messenger and multi-wavelength era. However, caution should be made in the analysis of weak bursts with very few counts. In fact, there are many dedicated studies dealing with weak signal and low counts statistics (Li & Ma, 1983; Kaastra, 2017; Hannam & Thompson, 1999; Hauschild & Jentschel, 2001).

The first step of studying weak bursts is finding them by burst search algorithm. In an earlier work, Blackburn et al. (2015) developed a sensitive coherent search method for targeted Fermi/GBM (Meegan et al., 2009) follow-up observation of GW events. This method has been used to follow up all LIGO triggers including sub-threshold GW triggers. It is also used to search for GRBs and magnetar bursts (Fletcher, 2021; Cai et al., 2021b). Several improvements have been made to this coherent search method by many authors (Goldstein et al., 2016; Goldstein et al., 2019; Cai et al., 2021a), including background estimation, spectral template, calculation speed, search sensitivity as well as rejection of false triggers caused by instrumental effects. However, this coherent search method is deduced with Gaussian statistics, which is a good approximation for bright bursts with sufficient number of counts, but may be problematic for weak bursts with very few counts.

Motivated by the generic search for all kinds of bursts, including the weak bursts, here we deduced the coherent search method based on maximum likelihood ratio in Poisson statistics. Then we compared the performance of the Poisson-based search method to the Gaussian-based one by detailed Monte Carlo simulations. In section 2, we describe the likelihood ratio based coherent search method in both Gaussian and Poisson statistics. Section 3 depicts the simulation and result. Finally, discussions and conclusions are presented in section 4 and 5.

2 Burst Search Methods

Burst search methods discussed here are based on the framework of likelihood ratio, which was first presented by Neyman and Pearson (Neyman & Pearson, 1928). The likelihood ratio is defined as:

[TABLE]

where $P(d|H_{1}(s))$ and $P(d|H_{0})$ are the likelihood functions for two hypothesis and models, $H_{1}(s)$ and $H_{0}$ , respectively. $d$ and $s$ represent observation data and model parameters, respectively. For simplicity, one usually use log-likelihood ratio ( $\mathcal{L}$ ) which is defined as: $\mathcal{L}={\rm ln}\,\Lambda$ .

Blackburn et al. (2015) developed a coherent search method based on the likelihood ratio to search GBM data for various bursts (Fletcher, 2021; Cai et al., 2021b). The likelihood ratio is defined as the ratio between probabilities (likelihood) of two hypothesis: (1) $H_{1}$ : the observed counts are contributed by the background plus burst signal; (2) $H_{0}$ : the observed data is purely background without any burst. This method has been improved by a series of studies (Goldstein et al., 2016; Goldstein et al., 2019; Kocevski et al., 2018) and widely used in the burst search of Fermi/GBM, Insight-HXMT and other instruments (e.g. Hamburg & others", 2020; Cai et al., 2021a).

In above studies, the likelihood function is formulated in the Gaussian statistics with the assumption of large number of counts. To deal with weak bursts with few counts, the likelihood can be rewritten in the Poisson statistics. Both the Gaussian and Poisson cases are presented as following.

2.1 Gaussian case

Following the presentation in Blackburn et al. (2015) and Cai et al. (2021a), the coherent search method based on the Gaussian statistics could be formulated as:

[TABLE]

where $i$ represents the number of data sets in each detectors and channels, $j$ is the total number of detectors and channels, ${d}_{i}$ and ${\sigma}_{{d}_{i}}$ are the observed data (counts) and standard deviation of the expected data (background+signal), respectively, $\hat{n}_{i}$ and ${\sigma}_{{n}_{i}}$ are the estimated background and the standard deviation of the background data, respectively, ${r}_{i}$ and $s$ represent the burst signal with default amplitude of 1 (i.e. un-normalized signal) and the intrinsic source amplitude, respectively, $\widetilde{d}_{i}$ is the background-subtracted data (i.e. net counts).

Then we can define the log-likelihood ratio (hereafter likelihood ratio for simplicity):

[TABLE]

where $\mathcal{L}_{\rm g}$ is the likelihood ratio for Gaussian statistics.

2.2 Poisson case

For the case of weak bursts with fewer counts, the calculation of likelihood should be based on the Poisson distribution, thus Eq.2, 3 and 5 could be rewritten as:

[TABLE]

where $\mathcal{L}_{\rm p}$ represent the likelihood ratio of Poisson distribution. Other parameters are defined the same as section 2.1.

As we know, when the mean values ( $\mu$ ) is large, the Poisson distribution can be well approximated by a Gaussian distribution with standard deviation of $\sqrt{\mu}$ . Also, the central limit theorem states that a variable should follow Gaussian distribution if it is the sum of many random variables that are independent identically distributed. Therefore, these two methods based on Poisson and Gaussian distribution may show significant difference only for those cases that the mean value is small and the number of variables that contribute to the sum of likelihood ratio is not large.

3 Simulations and Results

We implement a series of Monte Carlo simulations to investigate the difference between Poisson-based and Gaussian-based search methods presented in section 2.1 and section 2.2. Artificial data sets are created by the following steps:

(1)

Initialize the mean value of the background (i.e., $\hat{n}_{i}$ ) of each detector and each energy channel (i.e., $\sim$ 200 counts/s in 10–20 keV, $\sim$ 160 counts/s in 20–50 keV, $\sim$ 70 counts/s in 50–100 keV, $\sim$ 75 counts/s in 100-200 keV) using the GECAM data (Xiao et al., 2022a). 2. (2)

Assume there is no evolution of the mean value of the background over time within a short time window. The known (preset) mean value of the background can be used instead of the estimated background from a polynomial fit to simulated data 111In real observations, the mean value of the background is unknown, and one should use the estimated background and it’s uncertainty. PGSTAT (Poisson data with Gaussian background, (Arnaud et al., 2022) should be utilized in the process of searching the real observation data for burst signal. We note that performing the background estimation with PGSTAT statistic does not alter any of our main conclusions.. 3. (3)

Set the incident direction, flux amplitude, spectra shape and duration of burst source to known (simulated) values. 4. (4)

Calculate the burst signal by multiplying the detector response matrix of GECAM with the burst spectra (e.g. Band Function, Band et al., 1993). 5. (5)

Sample the data from Poisson distribution with $\lambda$ (expectation) equal to the mean value of the background (or the mean value of the background plus the burst signal).

The simulated data set is created by adding the burst signal to background, as shown in Figure 1. In order to assess the difference between Poisson-based and Gaussian-based search method, two different types of light curves have been studied by setting the number of data sets (i.e. detector and channel) to 1 or 100, as described in the next two subsections.

3.1 The case for one detector and

one channel

For the first set of simulations, we set the number of data sets (detector and channel) to 1 (i.e., $j=1$ ) for simplicity222It is possible that there is only 1 detector that has good observation for some very weak bursts.. To mimic the case for very weak and short bursts (e.g., burst duration of about 10 ms), the mean value of background is set to 2 counts (i.e., $\hat{n}_{i}=2$ in the GECAM energy range of 10–20 keV). The parameter of mean value of background is frozen in the following simulations of this subsection, unless otherwise stated.

We simulated the burst signal (i.e., $r_{i}s$ ) ranging from 1 counts to 24 counts. The observed counts (i.e., $d_{i}$ ) is background plus burst signal, and then the likelihood ratio (LR) is calculated using the known values of observed data counts, mean value of background, un-normalized signal and source amplitude( $d_{i}$ , $\hat{n}_{i}$ , $r_{i}$ , $s$ ). The LR as a function of the observed counts for the Poisson and Gaussian case is shown in the left panel of Figure 2, which shows that the LR in Gaussian statistics ( $\mathcal{L}_{\rm g}$ ) is significantly larger than that of Poisson case ( $\mathcal{L}_{\rm p}$ ) for large number of counts. This trend is opposite only for those cases with less than about 5 counts. We take the observed data of 2 counts and 20 counts as examples to show the difference of likelihood $P(d|H_{1}(s))$ and $P(d|H_{0})$ (see section 2 for more details of theses two likelihoods) between Poisson and Gaussian cases. As shown in the right panel of Figure 2, for high observed counts (e.g., 20 counts) and low background (e.g., 2 counts), $P(d|H_{0})$ is different between Poisson and Gaussian cases, while $P(d|H_{1}(s))$ is similar to each other. For low observed counts (e.g., 3 counts) and low background (e.g., 2 counts), both $P(d|H_{0})$ and $P(d|H_{1}(s))$ show some difference between Poisson and Gaussian cases.

As mentioned above, we assume that there is no evolution of the mean value of the background within a short time window (for a short burst). Therefore, we focus on the time bin of signal (e.g., $T_{0}\sim$ $T_{0}+0.01$ s) during simulation, as shown in the light curve of the right panel of Figure 3. The coherent search method of Gaussian case and Poisson case then are used to search for burst in the simulated time bin.

Assuming a source location and burst spectrum, the remaining free parameter of the likelihood ratio is burst amplitude $s$ . The likelihood ratio can be maximized by choosing a proper amplitude. Therefore, the maximum likelihood ratio corresponds to the best estimation of amplitude, which is considered to be the nearest (best) value of the true source amplitude. We test the consistency of the estimation of burst amplitude for these two search methods. The initialized mean value of signal is set to 9 counts. We run a Monte Carlo simulation and obtain a series of Poisson distributed data samples. These data samples, as unknown-amplitude observed data from several observations in the time bin of signal 333Samples from Poisson distribution with expectation equal to the mean value of background (2 counts) plus signal (9 counts)., are used to calculate the best amplitude with Newton’s method that maximize $\mathcal{L}_{\rm g}$ and $\mathcal{L}_{\rm p}$ , respectively.

We take three data groups (see Table 1 for details) as examples to show the $\mathcal{L}_{\rm g}$ and $\mathcal{L}_{\rm p}$ variation as a function of unknown amplitude of $s$ . As shown in Figure 4, for each data sample, the best values of $s$ to reach the maximum of $\mathcal{L}_{\rm g}$ and $\mathcal{L}_{\rm p}$ are different for Poisson and Gaussian cases. The true amplitude of one data sample can be calculated using the known values of the mean values of background, un-normalized signal and observed data ( $\hat{n}$ , $r$ , $d$ ):

[TABLE]

As a test example, for the case of $\hat{n}=2$ , $r=9$ and $d=10$ (see the red line of Figure 4 and Table 1), $s_{\rm true}$ equals to 0.89, which is consistent with the estimation of amplitude ( $S_{\rm Poisson}$ , 0.89) when maximize the likelihood ratio using the Poisson case. The value of $s_{\rm true}$ is a little different from the estimation of amplitude ( $S_{\rm Gaussian}$ , 0.83) when maximize the likelihood ratio using the Gaussian case. Given the values of $S_{\rm Poisson}$ and $S_{\rm Gaussian}$ , we can calculate the likelihood ratio based on Gaussian case ( $\mathcal{L}_{\rm g}(S_{\rm Gaussian})$ and $\mathcal{L}_{\rm g}(S_{\rm Poisson})$ ) using Eq. 5 and based on Poisson case ( $\mathcal{L}_{\rm p}(S_{\rm Gaussian})$ and $\mathcal{L}_{\rm p}(S_{\rm Poisson})$ ) using Eq. 8, respectively. We note that $\mathcal{L}_{\rm g}(S_{\rm Gaussian})$ is larger than $\mathcal{L}_{\rm g}(S_{\rm Poisson})$ and $\mathcal{L}_{\rm p}(S_{\rm Poisson})$ is larger than $\mathcal{L}_{\rm p}(S_{\rm Gaussian})$ , which shows that $S_{\rm Gaussian}$ is better than $S_{\rm Poisson}$ for Gaussian statistics and $S_{\rm Poisson}$ is better than $S_{\rm Gaussian}$ for Poisson statistics. The results of best amplitude and maximum LR from all three data groups are listed in Table 1.

It is usually not very reliable to claim the detection of a burst when there is only a small excess in one detector and one channel, so more detailed simulations and analyses with detection of multi-detectors and multi-channels are performed in next subsection.

3.2 The case for multiple detectors and channels

The most probable application of the coherent search method is the detection of weak burst sources with multi-detectors and multi-channels. We simulate the burst signal detected by 25 detectors and 4 channels (for each detector) of GECAM using the response matrix in previous work (Qiao et al., 2022). For simplification, we assume that the source has constant flux during burst time and the background also stay stable within short time windows. All 25 detectors have the same background level.

As mentioned in section 1, it is important to find and study weak and short bursts. We set the duration of the burst (e.g., magnetar bursts) to 10 ms (Cai et al., 2021b). Typically the fluences of magnetar bursts detected by GBM (or GECAM) mostly range from $\sim 10^{-8}$ erg cm*-2* to $\sim 10^{-6}$ erg cm*-2* in the energy range of 10 keV – 200 keV (Collazzi et al., 2015; Xiong et al., 2022). We set the spectrum and flux of the weak short burst in this simulation to the Power Law function444A simple photon power law with photon index of $\alpha$ and normalization of $S$ (in units of photons cm*-2* s*-1* keV*-1*): ${f}_{\rm PL}(E)=S\times{(\frac{E}{100})}^{-\alpha}$ and $1\times 10^{-8}$ erg cm*-2* s*-1*(10–200 keV). The expected counts of each GECAM detector for simulated burst are shown in Figure 5, which are added to the background to simulate the observed light curve. The maximum and minimum number of expected burst counts of each detector and channel are about $\sim$ 2 counts and $\sim$ 0 counts. The total number of expected burst counts and background are 63 counts and 150 counts, respectively.

The theoretical light curves of each GECAM detector for the weak burst are shown in the left panel of Figure 6. The simulated light curves are shown in the right panel of Figure 6, which are sampled from Poisson distribution with the expected values equal to the counts in each time bin of the theoretical light curves. There are some signals with very few counts (less than about 5 counts) of each channel and detector for such kind of short weak bursts. All counts of each channel and detector unit are summed together for this weak burst, as shown in Figure 7. It is shown that there are very limited excess counts in the observed light curve for such weak short burst.

The coherent search methods based on Poisson statistics and Gaussian statistics are used to search for this weak burst in the simulated light curve. We calculate the likelihood ratio ( $\mathcal{L}_{\rm g}$ and $\mathcal{L}_{\rm p}$ ) of the time bin of the burst signal, using the values of estimated background and un-normalized signal (with the known location and spectrum of the source preset). The likelihood ratio can be maximized by varying the amplitude. This results in the best amplitude of 0.03 for both the Poisson case and Gaussian case, which has a little different from the preset amplitude (0.04) due to the Poisson fluctuation in the simulation. The maximum likelihood ratios corresponding to the best amplitude are 11.17 and 16.29 for Poisson case and Gaussian case, respectively. The resulted absolute values of $\mathcal{L}_{\rm g}$ and $\mathcal{L}_{\rm p}$ are different, which is the same as the case for one detector and one channel (see section 3.1).

We calibrate the corresponding confidence level for these two specific likelihood ratios by simulating the procedure of searching for signals in the simulated light curve including only Poisson noise. Using the same background level of the simulated weak burst, we executed a series of simulations to investigate the LR distribution for both Poisson and Gaussian cases, as shown in Figure 8. We find that the LR of Poisson case ( $2\mathcal{L}_{\rm p}$ ) follows the $\chi^{2}$ distribution with the degree of freedom (d.o.f.) of 1, even for the extremely low counts. By contrast, the LR of the Gaussian case ( $2\mathcal{L}_{\rm g}$ ) significantly deviates from this $\chi^{2}$ distribution in low counts regime.

With this LR distribution, we can estimate the significance of a specific likelihood ratio (T) according to:

[TABLE]

where $N_{\rm LR>T}$ and $N_{\rm{total}}$ are the number of simulated events with LR larger than T and total number of simulation events, respectively.

For this simulated burst mentioned above, the LR of Poisson case (11.17) and Gaussian case (16.29) correspond to the confidence level (calculated using Eq. 10) of $2.51\times 10^{-6}$ and $5.32\times 10^{-5}$ , respectively.

We also run a series of simulations to assess the estimation of source amplitude and confidence level for Gaussian case and Poisson case with the same expected burst counts and background. Given the values of background, un-normalized signal and observed counts, we can calculate the best amplitudes corresponding to maximum $\mathcal{L}_{\rm g}$ and $\mathcal{L}_{\rm p}$ .

The source amplitude of Gaussian case is correlated with that of Poisson case, as shown in the left panel of Figure 9. The distributions of the source amplitude of Gaussian case and Poisson case are shown in the right panel of Figure 9. We find that there are no significant differences between the amplitude of Gaussian case and Poisson case.

Based on the result of the simulations with background only, the significance of these specific likelihood ratios ( $P_{\rm Gaussian}(\mathcal{L}_{\rm g})$ and $P_{\rm Poisson}(\mathcal{L}_{\rm p})$ ) can be calculated using Eq. 10. These detailed results mentioned above as well as the significance given by $\chi^{2}$ distribution at 2 $\mathcal{L}_{\rm p}$ (equal to the form of the likelihood ratio in Wilk’s theorem) are listed in Table 2. Interestingly, we find that, the significance of a burst given by Gaussian case is lower than that of the Poisson case.

High counts levels are also used to study the difference of LR between the Poisson case and Gaussian case. The simulated light curve of each detector and each channel are sampled from Poisson distribution with the expected value of high counts (more than 20 counts). We assume that the duration of the simulated burst is 300 ms. The mean values of the background (i.e, 60 counts, 60 counts, 30 counts, 30 counts) for each energy channel (i.e., 10–20 keV, 20–50 keV, 50–100 keV and 100–200 keV) correspond to the in-flight background level of GECAM with the duration of 300 ms. The total number of expected signal counts and background are set to 282 and 4500. Summed light curve of all 4 channels and 25 detectors for the simulated weak burst are shown in the left panel of Figure 10. For the time bin of this burst, the estimated best amplitude and the corresponding likelihood ratio is 0.05 and 29.14 for Poisson case, which is well consistent with the estimated amplitude and likelihood ratio of 0.05 and 31.64 for Gaussian case. We note that this result for high counts is different from the case of low counts (i.e., low level of background and signal) mentioned above. The corresponding confidence level for the case of high counts level are also calibrated. The LR distribution of pure background variation for Poisson case and Gaussian case are shown in the right panel of Figure 10. It shows that the LR distribution for these two statistics are generally agreement with each other and both of them follow the $\chi^{2}$ distribution with the degree of freedom (d.o.f.) of 1.

4 Discussions

In this paper, we proposed the coherent search based on the Poisson statistics to search for all kinds of bursts, especially for weak and short bursts. We implemented a series of simulations (see Figure 1 for simulation framework) to evaluate the difference between the coherent search methods based on Poisson statistics and Gaussian statistics.

Two different number of data sets (i.e., $j$ ) are used for simulations: one detector and one channel, twenty-five detectors and four channels. For the case of one detector and one channel, the absolute value of LR and estimation of amplitude can be simply and directly compared with each other, which is very important for understanding the intrinsic difference between these two statistics. However, it is more common to search out a burst with multi-detectors and multi-channels (e.g., 25 detectors and 4 channels). Therefore, multi-detectors and multi-channels based on GECAM data have been implemented in our simulations.

As mentioned in section 2, the LR based on Poisson and Gaussian distribution may show significant difference for those two cases when the mean value of counts is small and the number of variables that contribute to the sum of likelihood in not large, i.e. the Gaussian distribution is not a good approximation any more. Therefore, we focus on the low counts level of background for weak and short bursts. Furthermore, high counts level of background is also applied to validate the consistency of these two statistics.

4.1 Comparison of the absolute values of LR

For the case of one detector and one channel, the likelihood ratio based on Poisson statistics is different from that of the Gaussian statistics in the low level of background. To avoid the error of statistical fluctuations, the simulated observation data is set to the mean value of background plus the mean value of signal. As shown in the left panel of Figure 2, for low background level (during a short duration of the burst, e.g., 10 ms), the LR is different both at low observed counts (e.g., weak bursts) and high observed counts (e.g., bright bursts). Since the LR is calculated using two probabilities ( $P(d|H_{1}(s)$ , $P(d|H_{0})$ ) with different mean values, the differences of LR is the reflection of these two probabilities. The distributions of Poisson and Gaussian with different mean values in the linear scale and logarithm scale are shown in the right panel of Figure 2, which demonstrates that $P(d|H_{0})$ of Poisson case at low expected counts (e.g., the background level of 2 counts) and high observed counts (i.e., background plus signal) are very different from that of Gaussian case. With a fewer observed counts, $P(d|H_{1}(s))$ is also different between these two distributions. We also find that both $P(d|H_{1}(s))$ and $P(d|H_{0})$ are similar to each other at high expected background counts (e.g., the background level of 20 counts).

We compared the results of low counts region with different number of data sets (i.e. 25 detectors and 4 channels) and find that it does not alter any of our main conclusions about the difference of LR between Poisson case and Gaussian case.

4.2 Comparison of the estimation of burst amplitude

As an important parameter for studying the brightness of bursts, the amplitude can be calculated through maximizing the LR.

For the case of one detector and one channel, the estimation of source amplitude is unbiased even for extremely weak burst using Poisson statistics. However, for Gaussian statistics, the estimation of source amplitude is slightly biased for weak burst. The correctness and difference of the estimation of source amplitude using these two statistics are test by Monte Carlo simulations. Simulated light curves are sampled from the Poisson distribution with the mean value of theoretical light curves, as shown in Figure 3. For the case of one detector and one channel, three simulated data groups with the same mean values of background (2 counts) and un-normalized signal (9 counts) are used to compare the amplitude of Poisson case and Gaussian case. Figure 4 shows that there is a little difference between the best estimated amplitudes of Poisson case and Gaussian case. We calculate the true amplitude of each simulated data sample using Eq.9 and find that the estimated amplitude of Poisson case is well consistent with the true value. The detailed results are listed in Table 1, which shows that the estimated amplitudes of Poisson case and Gaussian case can maximize LR of Poisson case and Gaussian case, respectively. We note that using the estimated amplitudes of Poisson case can not maximize the LR of Gaussian case, and vice versa.

For the case of 25 detectors and 4 channels, a weak magnetar burst is simulated to compare the Poisson case and Gaussian case. Assuming the weak bursts are detected by GECAM, it results in the maximum counts of 2 and the minimum counts of 0 in each detector and channel, as shown in Figure 5. Figure 6 and Figure 7 show the light curve of this simulated burst with low level of background and signal. We find that the best amplitude of Poisson case for this simulated burst is in agreement with the amplitude of Gaussian case. Figure 9 also show that there are no significant differences between the amplitude of Gaussian case and Poisson case with a series of simulations 555We note that Eq.9 is not appropriate for the case of multi-detectors and multi-channels with the consideration of the optimized weighted light curve (see Cai et al. (2021a) for more details). .

4.3 Comparison of the confidence level

Two common techniques to estimate the significance of observed burst signals are utilized in many studies: analytical calculation with formulae and direct calculation with simulations. According to Wilks’s theorem, the likelihood ratio (equal to the twice of the likelihood ratio in this work) approaches to the $\chi^{2}_{1}$ distribution as the number of data samples tends to infinity. Thus, the significance can be roughly estimated with $\chi^{2}_{1}$ distribution in the case of large data samples.

Our simulations show that, in low counts regime (e.g. less than 10 counts), only the LR of the simulated data from background fluctuations in Poisson statistics (i.e. $2\mathcal{L}_{\rm p}$ ) generally follow the $\chi^{2}_{1}$ distribution, while the LR in Gaussian case deviate the $\chi^{2}_{1}$ distribution significantly, as shown in Figure 8. We also find that in high counts regime, the LR of Gaussian case is well consistent with that of Poisson case, as shown in Figure 10.

Since the observation data are generally limited by the sensitivity of detectors, background level and the brightness of source, one must be cautious to use the Wilks’s theorem to calculate the significance of a burst candidate. Here, we make use of simulations with low background level to assess the confidence level for Poisson case and Gaussian case. For the case of multi-detectors and multi-channels, the significance of the simulated burst (e.g figure 7) are different between Poisson statistics and Gaussian statistics. Our results show that the Poisson-based search can provide higher significance of the burst than the Gaussian-based search (see table 2).

A weak burst with high background level is also simulated to validate the consistency between Poisson case and Gaussian case, as shown in Figure 10. We find that the LR, estimation of source amplitude and confidence level for Poisson case are well consistent with those of Gaussian case.

5 Conclusions

In this paper, we proposed the coherent search based on likelihood ration with Poisson statistics. We compared the likelihood ratio values given by Poisson and Gaussian statistics, derived the best amplitude of the burst using Newton’s method and presented the results of simulations for different setting of background and burst signal levels, including the results of pure background fluctuations.

We find that the Poisson-based search method has advantages than that based on Gaussian statistics, especially for weak and short bursts. When the counts number is very low (which is usual for very short burst down to ms time scale), the Poisson-based search can provide higher significance than the Gaussian-based search and its likelihood ratio still follows the $\chi^{2}$ distribution, which provides a fast estimation of the burst significance.

Although these two methods should be basically equal for bright bursts with large number of counts, to deal with general bursts including weak and short ones, we suggest that the Poisson-based coherent search (see section 2.2) should be used in transients search and study.

Acknowledgements

This work is supported by the National Key R&D Program of China (2021YFA0718500). We thank supports from the Strategic Priority Research Program on Space Science, the Chinese Academy of Sciences (Grant No. XDA15360102, XDA15360300, XDA15052700) , the National Natural Science Foundation of China (Grant No. 12173038 , U2038106) and the National HEP Data Center (Grant No. E029S2S1).

Data Availability

The data and codes underlying this article will be shared on reasonable request to the corresponding author.

Bibliography48

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Abbott et al. (2017) Abbott B. P., et al., 2017, Phys. Rev. Lett. , 119, 161101 · doi ↗
2An et al. (2020) An Z. H., et al., 2020, Radiation Detection Technology and Methods · doi ↗
3Arnaud et al. (2022) Arnaud K., Gordon C., et al. 2022, XSPEC Users’ Guide for version 12.12.1
4Band et al. (1993) Band D., et al., 1993, Ap J. , 413, 281 · doi ↗
5Blackburn et al. (2015) Blackburn L., Briggs M. S., Camp J., Christensen N., Connaughton V., Jenke P., Remillard R. A., Veitch J., 2015, Astrophys. J. Suppl. , 217, 8 · doi ↗
6Bochenek et al. (2020) Bochenek C. D., Ravi V., Belov K. V., Hallinan G., Kocz J., Kulkarni S. R., Mc Kenna D. L., 2020, Nature , 587, 59–62 · doi ↗
7CHIME/FRB Collaboration et al. (2020) CHIME/FRB Collaboration et al., 2020, Nature , 587, 54 · doi ↗
8Cai et al. (2021 a) Cai C., et al., 2021 a, Monthly Notices of the Royal Astronomical Society · doi ↗