TL;DR
This study independently tests for annual modulation signals in COSINE-100 dark matter data using Bayesian and information-theoretic methods, finding no significant evidence for such modulation.
Contribution
It introduces the first application of Bayesian and information theory techniques to assess annual modulation significance in COSINE-100 data.
Findings
Information theory tests favor a constant background over modulation.
Bayesian analysis strongly supports the background-only model.
No significant annual modulation signal detected in the data.
Abstract
We perform an independent search for annual modulation caused by dark matter-induced scatterings in the recently released COSINE-100 data. We test the hypothesis that the data contain a sinusoidal modulation against the null hypothesis that the data consist of only background. We compare the significance using frequentist and information-theoretic techniques (such as AIC and BIC), and also using the Bayesian model comparison technique. The information theory-based tests mildly prefer a constant background over a sinusoidal signal with the same period as that found by the DAMA collaboration. The Bayesian test, however, strongly prefers the background model. This is the first proof-of-principle demonstration of the application of Bayesian and information theory-based techniques to COSINE-100 data to assess the significance of annual modulation.
Table 1: Best-fit background parameters for each crystal from the background-only fit.

| Crystal | $C$ (cpd/keV/kg) | $p_0$ (cpd/keV/kg) | $p_1$ (days) |
|---|---|---|---|
| Crystal 2 | 2.48 | 0.90 | 995.59 |
| Crystal 3 | 0.00 | 3.77 | 3675.20 |
| Crystal 4 | 2.52 | 1.50 | 485.11 |
| Crystal 6 | 2.05 | 0.77 | 1000.20 |
| Crystal 7 | 1.90 | 1.01 | 1000.05 |
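Table 2: Best-fit parameters for the modulation fit with all 18 parameters free. The modulation parameters $A$, $\omega$, and $t_0$ are common to all crystals.

| Crystal | $C$ (cpd/keV/kg) | $p_0$ (cpd/keV/kg) | $p_1$ (days) | $A$ (cpd/keV/kg) | $\omega$ (radians/day) | $t_0$ (days) |
|---|---|---|---|---|---|---|
| Crystal 2 | 2.0 | 0.88 | 994.55 | | | |
| Crystal 3 | 0.01 | 3.76 | 3675.01 | | | |
| Crystal 4 | 2.61 | 1.47 | 421.46 | 0.013 | 0.024 | 235.32 |
| Crystal 6 | 2.05 | 0.77 | 1008.91 | | | |
| Crystal 7 | 1.91 | 0.99 | 993.22 | | | |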
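Table 3: Best-fit parameters for the modulation fit with $\omega$ fixed at 0.0172 radians/day. The modulation parameters $A$ and $t_0$ are common to all crystals.

| Crystal | $C$ (cpd/keV/kg) | $p_0$ (cpd/keV/kg) | $p_1$ (days) | $A$ (cpd/keV/kg) | $t_0$ (days) |
|---|---|---|---|---|---|
| Crystal 2 | 2.48 | 0.90 | 995.79 | | |
| Crystal 3 | 0.00 | 3.77 | 3674.63 | | |
| Crystal 4 | 2.51 | 1.52 | 486.23 | 0.009 | 133.2 |
| Crystal 6 | 2.03 | 0.79 | 1001.07 | | |
| Crystal 7 | 1.88 | 1.04 | 1000.21 | | |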
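Table 4: Model comparison between the background-only and modulation hypotheses when $\omega$ is a free parameter.

| | $H_0$ (background only) | $H_1$ (background + modulation) |
|---|---|---|
| Frequentist | | |
| $\chi^2$/DOF | 174.2/180 | 170.9/177 |
| $\chi^2$ p.d.f. | 0.0207 | 0.0208 |
| $p$-value | 0.34 | |
| significance ($\sigma$) | 0.42 | |
| AIC | 204.2 | 206.8 |
| $\Delta$AIC | 2.6 | |
| BIC | 253.3 | 265.8 |
| $\Delta$BIC | 12.5 | |
| Log Bayes factor | -16 | |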
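Table 5: Model comparison between the background-only and modulation hypotheses when $\omega$ is fixed at the DAMA value.

| | $H_0$ (background only) | $H_1$ (background + modulation) |
|---|---|---|
| Frequentist | | |
| $\chi^2$/DOF | 174.2/180 | 172.6/178 |
| $\chi^2$ p.d.f. | 0.0207 | 0.0209 |
| $p$-value | 0.44 | |
| significance ($\sigma$) | 0.14 | |
| AIC | 204.2 | 206.6 |
| $\Delta$AIC | 2.4 | |
| BIC | 253.3 | 262.2 |
| $\Delta$BIC | 8.9 | |
| Log Bayes factor | -7 | |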
An independent assessment of significance of annual modulation in COSINE-100 data
Aditi Krishak (1)
Shantanu Desai (2)
(1) Department of Physics, Indian Institute of Science Education and Research Bhopal, Madhya Pradesh 462066, India
(2) Department of Physics, Indian Institute of Technology Hyderabad, Kandi, Telangana 502285, India
1 Introduction
Although about 25% of the universe's energy density consists of cold dark matter (Planck Collaboration et al., 2018), we have no clue about the mass of the dark matter particle or its non-gravitational couplings (Jungman et al., 1996). The most theoretically favored and widely studied cold dark matter candidate is the Weakly Interacting Massive Particle (or WIMP) (Lee and Weinberg, 1977). A large number of underground laboratory-based experiments have been taking data for more than 30 years to look for direct signatures of WIMP-nucleon interactions (Schumann, 2019). Among these, only the DAMA/LIBRA experiment has detected an annual modulation having all the right characteristics of being induced by WIMPs in our galaxy (Freese et al., 1988), with a statistical significance of about $12\sigma$ (Bernabei et al., 2018). However, the WIMP parameter space inferred from the DAMA/LIBRA results is ruled out by many other direct detection experiments. Although many attempts (e.g. Herrero-Garcia et al. 2012; Catena et al. 2016; Nobile et al. 2015; Herrero-Garcia et al. 2018; Kang et al. 2019, and references therein) have been made to reconcile the results of DAMA with the null results of other experiments using non-standard particle physics or astrophysics assumptions, the jury is still out on whether any of them can be satisfactorily reconciled with the latest results from all the direct detection experiments. The only possible resolution of this conundrum could be that no other direct detection experiment with null results used the same target material as DAMA, viz. thallium-doped NaI. The COSINE-100 experiment (Adhikari et al., 2019) is one of the first experiments whose detector is designed to be a replica of the DAMA target, and hence can confirm or refute the DAMA annual modulation claims in a model-independent fashion.
Many other experiments designed to carry out a similar test of the DAMA annual modulation, such as DM-Ice17 (Barbosa de Souza et al., 2017), KIMS (Kim et al., 2019), SABRE (Antonello et al., 2019), and ANAIS-112 (Amaré et al., 2019), are also about to start taking data, and the ANAIS experiment has released preliminary results.
In a recent work (Krishak et al., 2019), we did an independent assessment of the DAMA/LIBRA annual modulation claims from their most recent data release, using three disparate model comparison techniques: frequentist (Desai, 2016), Bayesian (Trotta, 2017; Kerscher and Weller, 2019), and information-theoretic techniques (Liddle, 2004, 2007). The Bayesian and information-theoretic techniques are widely used for model comparison in astrophysics and cosmology, but rarely used in direct dark matter detection experiments. In this work, we apply the same techniques to the recently released data from the COSINE-100 experiment (Adhikari et al., 2019).
The outline of this paper is as follows. A brief summary of the COSINE-100 results can be found in Sect. 2. Our own re-analysis is described in Sect. 3. We conclude in Sect. 4. We do not provide any details of the theory behind the different model comparison techniques used herein, which can be found in Krishak et al. (2019) and references therein. Our analysis codes and results can be found in a GitHub repository, whose URL is provided in Sect. 4.
2 Recap of COSINE-100 results
We provide a brief recap of the main results in Adhikari et al. (2019) (CS100 hereafter), wherein more details can be found. The COSINE-100 experiment is located at the Yangyang underground laboratory in South Korea under more than 700 m of rock overburden. The experiment consists of eight NaI crystals (labeled C1 to C8) doped with thallium and was designed to mimic the DAMA/LIBRA setup as closely as possible. Out of these, data from three crystals was omitted due to various systematics, as discussed in CS100. Data taking commenced in October 2016 and the results released in CS100 correspond to a total exposure of 97.7 kg years. The count rates for the five crystals used for the analysis can be found in Fig. 3 of CS100. The event rates were fit to the following functional form:
$$H_1(t) = C + p_0 \exp\left(-\frac{\ln 2 \cdot t}{p_1}\right) + A \cos \omega (t - t_0) \qquad (1)$$
The first two terms in Eq. 1, consisting of the constant $C$ and the exponential decay with amplitude $p_0$ and decay timescale $p_1$, parameterize the background rates, and the last cosine term, with amplitude $A$, angular frequency $\omega$, and phase $t_0$, is a potential signature of annual modulation caused by dark matter interactions. The data from all the crystals were simultaneously fit to the same values of the cosine function parameters, but separately for $C$, $p_0$, and $p_1$, using $\chi^2$ minimization. Their results are consistent within $1\sigma$ with both the null hypothesis of no oscillation and the DAMA/LIBRA annual modulation best-fit values in the 2-6 keV range. The best-fit parameters for different scenarios (phase fixed as well as floating) can be found in Table 1 of CS100.
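To fix notation, the functional form described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the background is taken to be a constant plus an exponential written as `p0 * exp(-ln(2) * t / p1)`, the modulation is taken to be `A * cos(omega * (t - t0))`, and the function names `background_rate` and `modulated_rate` are hypothetical rather than taken from the COSINE-100 or the authors' code.

```python
import numpy as np

def background_rate(t, C, p0, p1):
    """Constant plus exponentially decaying background (assumed parameterization)."""
    return C + p0 * np.exp(-np.log(2.0) * t / p1)

def modulated_rate(t, C, p0, p1, A, omega, t0):
    """Background plus a cosine modulation with amplitude A, angular frequency omega, phase t0."""
    return background_rate(t, C, p0, p1) + A * np.cos(omega * (t - t0))

# Time in days; the parameter values here are arbitrary illustrations.
t = np.array([0.0, 100.0, 500.0])
print(background_rate(t, 2.0, 1.0, 500.0))
print(modulated_rate(t, 2.0, 1.0, 500.0, 0.01, 0.0172, 100.0))
```

With `A = 0` the second function reduces to the first, which is what makes the two hypotheses nested.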
3 Our Analysis
For our analysis, we obtained the data points and the errors associated with them from the COSINE-100 collaboration. The data consist of event rates for crystals 2, 3, 4, 6, and 7 in the 2-6 keV energy bin in 15-day intervals. We first fit only the background rates (first two terms in Eq. 1) to the data and determine the best-fit values for $C$, $p_0$, and $p_1$; this model is assumed to be our null hypothesis $H_0$, i.e.,
$$H_0(t) = C + p_0 \exp\left(-\frac{\ln 2 \cdot t}{p_1}\right) \qquad (2)$$
We then determine the estimates of the best-fit parameters of the sinusoidal modulation in Eq. 1, and this is considered as the hypothesis to be tested, viz. $H_1$. These two models are compared using frequentist, information theory (AIC and BIC), and Bayesian model comparison techniques. These techniques have been recently reviewed in Krishak et al. (2019) and references therein, and we skip the details for brevity.
3.1 Parameter Estimation
Parameter estimation for the models under consideration is the first step towards model comparison analysis. The data points come with experimental errors in the event rates ($\sigma_i$). For the model with only the background signal, we find the best-fit values of the parameters using $\chi^2$ minimization for each crystal separately. The $\chi^2$ functional between the data ($y_i$) and the model function ($H(t_i)$) is given by:
$$\chi^2 = \sum_{i} \left(\frac{y_i - H(t_i)}{\sigma_i}\right)^2 \qquad (3)$$
where $y_i$ denotes the COSINE-100 event rate in time bin $t_i$ for each crystal, and $H(t)$ is the model, which for the background-only case is defined in Eq. 2. All the background parameters are kept free, with a positivity constraint (lower bound of 0) on all of them; the best-fit values obtained for each crystal by $\chi^2$ minimization are summarized in Table 1.
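A bounded chi-square fit of the background-only model to a single crystal might look like the sketch below. Several things here are assumptions rather than the authors' setup: the data are synthetic (39 bins of 15 days, generated inside the block), the exponential is written with a `ln 2` factor, the optimizer for this step is taken to be SLSQP, and `p1` gets a tiny positive floor instead of an exact zero bound to avoid division by zero.

```python
import numpy as np
from scipy.optimize import minimize

def background_rate(t, C, p0, p1):
    """Constant plus exponentially decaying background (assumed parameterization)."""
    return C + p0 * np.exp(-np.log(2.0) * t / p1)

def chi2_background(params, t, y, sigma):
    """Chi-square between the binned rates and the background-only model."""
    C, p0, p1 = params
    return np.sum(((y - background_rate(t, C, p0, p1)) / sigma) ** 2)

# Synthetic stand-in for one crystal (NOT COSINE-100 data): 39 bins of 15 days.
rng = np.random.default_rng(0)
t = 15.0 * np.arange(39) + 7.5
sigma = np.full(t.size, 0.05)
y = background_rate(t, 2.5, 1.0, 500.0) + rng.normal(0.0, sigma)

start = np.array([2.4, 1.1, 450.0])
# Zero lower bound on C and p0; tiny positive floor on p1 to avoid dividing by zero.
bounds = [(0.0, None), (0.0, None), (1e-6, None)]
fit = minimize(chi2_background, start, args=(t, y, sigma), method="SLSQP", bounds=bounds)

print("best-fit (C, p0, p1):", fit.x)
print("chi2 at best fit:", fit.fun)
```

Because the three background parameters are strongly correlated over a baseline shorter than the decay timescale, different starting points can land on noticeably different parameter values with nearly the same chi-square.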
For the model with a sinusoidal modulation, $H_1$ (where $H_1(t)$ is defined in Eq. 1), the $\chi^2$ minimization is done concurrently for all the crystals by using the same values of the amplitude $A$, angular frequency $\omega$, and phase $t_0$ for all the crystals, while the background parameters can be different for each crystal. We first do a minimization keeping all the 18 parameters free (three background parameters for each of the five crystals plus the three modulation parameters). Since the time bin width is 15 days, we are not sensitive to periods shorter than 15 days. Therefore, while doing the fits, we also impose a lower bound of 15 days on the period. The optimization is done using the SLSQP (Nocedal and Wright, 2006) constrained optimization algorithm as implemented in the scipy Python module, keeping a positive lower bound on all background parameters, as well as on the amplitude and frequency. The best fits obtained are listed in Table 2. These fits along with the data can be found in Fig. 1. The best-fit value for $\omega$ is about 0.024 radians/day (corresponding to a period of about 257 days).
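The bookkeeping for the joint fit (15 per-crystal background parameters plus 3 shared modulation parameters) can be sketched as follows. The parameter ordering, the helper names `unpack` and `joint_chi2`, and the `ln 2` form of the exponential are assumptions made for illustration; the bounds only mirror the constraints described in the text, with the 15-day lower bound on the period expressed as an upper bound on the angular frequency.

```python
import numpy as np

N_CRYSTALS = 5  # crystals 2, 3, 4, 6 and 7

def unpack(theta):
    """Split the 18-vector into per-crystal backgrounds and shared modulation parameters."""
    theta = np.asarray(theta, dtype=float)
    backgrounds = theta[:3 * N_CRYSTALS].reshape(N_CRYSTALS, 3)  # rows of (C, p0, p1)
    A, omega, t0 = theta[3 * N_CRYSTALS:]
    return backgrounds, A, omega, t0

def joint_chi2(theta, datasets):
    """Sum of chi-squares over crystals; datasets is a list of (t, y, sigma) arrays."""
    backgrounds, A, omega, t0 = unpack(theta)
    total = 0.0
    for (C, p0, p1), (t, y, sigma) in zip(backgrounds, datasets):
        model = C + p0 * np.exp(-np.log(2.0) * t / p1) + A * np.cos(omega * (t - t0))
        total += np.sum(((y - model) / sigma) ** 2)
    return total

# Non-negative backgrounds, amplitude and frequency; period >= 15 days
# means omega <= 2*pi/15 rad/day; the phase t0 is left unbounded.
bounds = [(0.0, None)] * (3 * N_CRYSTALS) + [(0.0, None), (0.0, 2.0 * np.pi / 15.0), (None, None)]
print(len(bounds), "free parameters")
```

These pieces could then be handed to `scipy.optimize.minimize(joint_chi2, start, args=(datasets,), method="SLSQP", bounds=bounds)` with a user-supplied `start` vector and `datasets` list.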
To test the DAMA annual modulation claim, we also carry out the optimization of this model keeping the modulation frequency $\omega$ fixed at the DAMA value of 0.0172 radians/day (or period fixed at 365.25 days). We then redo the minimization with 17 free parameters (one fewer than before), with the same lower bound on all background parameters and on the amplitude. The best fit for this optimization can be found in Table 3.
The values we obtain for the background parameters of each crystal in both cases, i.e. keeping all parameters free (Table 2) and keeping $\omega$ fixed (Table 3), differ significantly from those obtained by the COSINE-100 collaboration (these values are not displayed in CS100 and were obtained by private communication with its authors). This difference is due to the degeneracy between the background parameters $C$, $p_0$, and $p_1$ in Eq. 1.
We now present model comparison results using both these fits.
3.2 Model Comparison
3.2.1 Frequentist Model Comparison
We carry out frequentist model comparison by first calculating the $\chi^2$ values using Eq. 3 with the best-fit parameters for each model, summed over all the data points for all five crystals. Then, using the best-fit $\chi^2$ and degrees of freedom, the goodness of fit for each model can be calculated from the $\chi^2$ p.d.f. The model with the greater value of the $\chi^2$ p.d.f. would be considered as the favored model.
Making use of the fact that the two models are nested, we use Wilks' theorem (Wilks, 1938) to quantify the $p$-value of the cosine model as compared to the background model. For our example, the difference in $\chi^2$ between the two models follows a $\chi^2$ distribution with degrees of freedom equal to the number of additional free parameters in the cosine model, which is three when $\omega$ is free (and two when it is fixed). From the cumulative distribution of $\chi^2$, we obtain the $p$-value from the $\chi^2$ c.d.f. The corresponding significance or $Z$-score is calculated using the prescription in Cowan et al. (2011). A high $p$-value and a low $Z$-score indicate weak evidence against the null hypothesis. The $\chi^2$ values per degree of freedom and the model likelihood given by the $\chi^2$ p.d.f. calculated for each model for both cases ($\omega$ varying and fixed) can be found in Tables 4 and 5 respectively, along with the $p$-value and $Z$-score. As we can see, for both cases (with $\omega$ varying and fixed), the difference in $\chi^2$ between the two hypotheses is negligible. The significance of annual modulation with the same period as in the DAMA data is negligible (less than $1\sigma$).
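The frequentist numbers can be approximately re-derived from the rounded chi-square values in Tables 4 and 5. The sketch below assumes that the significance is the one-sided Gaussian equivalent of the p-value, which is our reading of the Cowan et al. prescription, and that the number of extra parameters is 3 (frequency free) or 2 (frequency fixed); the helper names are hypothetical. Because the inputs are rounded, small differences from the tabulated values are expected.

```python
from scipy.stats import chi2, norm

def chi2_pdf_value(chi2_min, dof):
    """Value of the chi-square p.d.f. at the best-fit chi-square."""
    return chi2.pdf(chi2_min, dof)

def wilks_p_value(chi2_null, chi2_alt, extra_params):
    """p-value of the chi-square difference between nested models (Wilks' theorem)."""
    return chi2.sf(chi2_null - chi2_alt, extra_params)

def z_score(p_value):
    """One-sided Gaussian significance, Z = Phi^{-1}(1 - p) (assumed convention)."""
    return norm.isf(p_value)

# Rounded best-fit chi-square values taken from Tables 4 and 5.
p_free = wilks_p_value(174.2, 170.9, extra_params=3)
p_fixed = wilks_p_value(174.2, 172.6, extra_params=2)

print("chi2 p.d.f. (H0):", chi2_pdf_value(174.2, 180))
print("omega free : p =", p_free, " Z =", z_score(p_free))
print("omega fixed: p =", p_fixed, " Z =", z_score(p_fixed))
```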
3.2.2 Information Criteria
The Akaike Information Criterion value (AIC) is given by (Liddle, 2007):
$$\mathrm{AIC} = \chi^2_{\min} + 2p \qquad (4)$$
The Bayesian Information Criterion is given by (Liddle, 2007):
$$\mathrm{BIC} = \chi^2_{\min} + p \ln N \qquad (5)$$
where $p$ is the number of free parameters, $\chi^2_{\min}$ is the minimum $\chi^2$ value, and $N$ is the total number of data points. The model with the smaller value of AIC and BIC is preferred. We then calculate the differences $\Delta$AIC and $\Delta$BIC between the $H_1$ and $H_0$ hypotheses, and evaluate the significance using the qualitative strength-of-evidence rules given in Shi et al. (2012). For the case with all modulation parameters free, the AIC and BIC values are tabulated in Table 4. We see that the background-only hypothesis has smaller values for both AIC and BIC when $\omega$ is a free parameter. However, according to the strength-of-evidence rules, $\Delta$AIC and $\Delta$BIC have to be greater than 10 for the model with the smaller value to be decisively favored over the other. When $\omega$ is a free parameter, this criterion is not satisfied for $\Delta$AIC (2.6), so AIC cannot decisively favor either model, whereas $\Delta$BIC (12.5) decisively favors the null hypothesis. For the case with the modulation period fixed, the AIC and BIC values are tabulated in Table 5. We again get smaller values of both AIC and BIC for the null hypothesis of a background-only model. However, since $\Delta$AIC and $\Delta$BIC are less than 10, they do not decisively favor either model. According to the strength-of-evidence rules (Shi et al., 2012), $\Delta$BIC (8.9) shows strong evidence for the background-only hypothesis. Therefore, the background-only hypothesis is mildly preferred by the information theory-based tests.
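As a cross-check, the tabulated AIC and BIC values can be recomputed from the best-fit chi-square values. This sketch assumes the chi-square forms AIC = chi2_min + 2p and BIC = chi2_min + p ln N, and takes N = 195 data points, a number inferred here from the 180 degrees of freedom plus 15 background parameters of the null model rather than quoted in the text. Agreement with Tables 4 and 5 is expected only to within rounding of the chi-square inputs.

```python
import math

def aic(chi2_min, n_params):
    """Akaike Information Criterion in its chi-square form."""
    return chi2_min + 2 * n_params

def bic(chi2_min, n_params, n_data):
    """Bayesian Information Criterion in its chi-square form."""
    return chi2_min + n_params * math.log(n_data)

N_DATA = 195  # inferred: 180 degrees of freedom + 15 background parameters

# (best-fit chi-square, number of free parameters) from Tables 4 and 5
models = {
    "H0 (background only)": (174.2, 15),
    "H1 (omega free)": (170.9, 18),
    "H1 (omega fixed)": (172.6, 17),
}

for name, (chi2_min, n_params) in models.items():
    print(f"{name}: AIC = {aic(chi2_min, n_params):.1f}, "
          f"BIC = {bic(chi2_min, n_params, N_DATA):.1f}")
```

The larger BIC penalty, ln(195) per parameter (about 5.3, versus 2 for AIC), is why the three or two extra modulation parameters cost far more in BIC than in AIC.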
3.2.3 Bayesian Model Comparison
We carry out a Bayesian model comparison by calculating the Bayes factor for the model $M_2$ in comparison to the hypothesis $M_1$. Here, we consider the null hypothesis ($H_0$) to be $M_1$ and the cosine model ($H_1$) to be $M_2$. The Bayes factor is given by (Trotta, 2017):
$$B_{21} = \frac{P(D|M_2)}{P(D|M_1)} \qquad (6)$$
where $P(D|M_2)$ and $P(D|M_1)$ are the marginal likelihood or Bayesian evidence for $M_2$ and $M_1$ respectively, given data $D$. Similar to the previous model comparison tests, we calculated the Bayes factor for two cases: when $\omega$ is fixed at the DAMA best-fit value as well as when $\omega$ is a free parameter. Unlike the previous three tests, this statistic does not use the best-fit values of the parameters.
We first calculate the Bayesian evidence for both $M_1$ and $M_2$ using the multi-threaded dynesty package (Speagle, 2019) in Python, which uses the Dynamic Nested Sampling algorithm for calculating the Bayesian evidence (Feroz et al., 2009; Mukherjee et al., 2006). The likelihood function ($\mathcal{L}$) for the combined data from all the five crystals, given the model and a set of parameters, is assumed to be a Gaussian:
$$\mathcal{L} = \prod_{\rm crystals} \prod_{i=1}^{n} \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{y_i - H_1(t_i)}{\sigma_i}\right)^2\right\} \qquad (7)$$
where $H_1(t)$ is of the form described in Eq. 1, $n$ is the total number of data points in each crystal, and the outer product is over the five crystals used for the analysis. We then multiply the likelihood by priors for all the background parameters as well as for $A$, $t_0$, and $\omega$ (when it is kept as a free parameter).
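In practice nested samplers work with the logarithm of the likelihood, so the product over crystals becomes a sum. The following is a minimal sketch of such a Gaussian log-likelihood; the function names and the way the per-crystal model predictions are passed in are assumptions made for illustration and need not match the authors' code.

```python
import numpy as np

def gaussian_loglike(y, model, sigma):
    """Log of a product of independent Gaussians for one crystal."""
    y, model, sigma = (np.asarray(a, dtype=float) for a in (y, model, sigma))
    residual = (y - model) / sigma
    return np.sum(-0.5 * residual ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi)))

def combined_loglike(per_crystal):
    """Sum of log-likelihoods over crystals; per_crystal is a list of (y, model, sigma)."""
    return sum(gaussian_loglike(y, model, sigma) for y, model, sigma in per_crystal)

# Two toy "crystals" with made-up rates, model predictions and errors.
toy = [
    ([2.9, 2.8], [3.0, 2.8], [0.1, 0.1]),
    ([2.1, 2.0], [2.0, 2.0], [0.1, 0.1]),
]
print(combined_loglike(toy))
```

Up to the parameter-independent normalization term, this is minus one half of the chi-square used for the fits.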
We choose uniform priors with fixed lower and upper bounds for the background parameters $C$, $p_0$, and $p_1$. These bounds are conservative and cover a huge swath of parameter space for the background parameters. Outside these bounds, the calculation of the Bayesian evidence also does not always converge. For the signal parameters of the sinusoid, we use bounded uniform priors for $A$ and for $t_0$. When $\omega$ is kept as a free parameter, we choose a uniform prior between $2\pi/600$ and $2\pi/15$ rad/day, which corresponds to periods between 15 days (bin size) and 600 days (maximum duration of the dataset).
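Nested samplers such as dynesty take the prior as a transform from the unit cube to the physical parameters. A sketch for uniform priors on the three signal parameters is given below. Only the frequency range follows from the text (periods between 15 and 600 days); the bounds on the amplitude and phase, `A_BOUNDS` and `T0_BOUNDS`, are hypothetical placeholders chosen purely for illustration.

```python
import numpy as np

OMEGA_MIN = 2.0 * np.pi / 600.0  # rad/day, period of 600 days
OMEGA_MAX = 2.0 * np.pi / 15.0   # rad/day, period of 15 days

# Hypothetical bounds, NOT taken from the paper.
A_BOUNDS = (0.0, 0.1)       # cpd/keV/kg
T0_BOUNDS = (0.0, 365.25)   # days

def signal_prior_transform(u):
    """Map a point u in the unit cube [0, 1]^3 to (A, omega, t0) under uniform priors."""
    u = np.asarray(u, dtype=float)
    lower = np.array([A_BOUNDS[0], OMEGA_MIN, T0_BOUNDS[0]])
    upper = np.array([A_BOUNDS[1], OMEGA_MAX, T0_BOUNDS[1]])
    return lower + (upper - lower) * u

print(signal_prior_transform([0.5, 0.5, 0.5]))
```

Note that a prior uniform in angular frequency is not uniform in period, so short periods receive most of the prior volume under this choice.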
The values of the Bayes factor for both fits can be found in Tables 4 and 5. We find that in both cases the Bayes factor is less than 1, indicating that the background model is favored over the cosine-based fit. We use Jeffreys' scale (Trotta, 2017) for a qualitative interpretation of the Bayes factor. In both cases, the Bayes factor lies beyond the threshold for strong evidence on this scale, and hence provides strong evidence in favor of the null hypothesis.
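Given the log-evidences returned by a nested sampler for the two models, the Bayes factor and its qualitative reading can be sketched as below. The thresholds on |ln B| of 1, 2.5, and 5 are the ones commonly quoted for the version of Jeffreys' scale in Trotta's review; they are an assumption here, since the text does not list them, and the log-evidence values in the example are made up.

```python
def ln_bayes_factor(ln_evidence_signal, ln_evidence_null):
    """Natural log of the Bayes factor of the signal model relative to the null model."""
    return ln_evidence_signal - ln_evidence_null

def jeffreys_strength(ln_b):
    """Qualitative strength of evidence from |ln B| (assumed thresholds: 1, 2.5, 5)."""
    magnitude = abs(ln_b)
    if magnitude < 1.0:
        strength = "inconclusive"
    elif magnitude < 2.5:
        strength = "weak"
    elif magnitude < 5.0:
        strength = "moderate"
    else:
        strength = "strong"
    favored = "signal model" if ln_b > 0 else "null model"
    return strength, favored

# Made-up log-evidences purely for illustration.
ln_b = ln_bayes_factor(-100.0, -92.0)
print(ln_b, jeffreys_strength(ln_b))
```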
4 Conclusions
Recently, the COSINE-100 experiment, designed to test the DAMA/LIBRA annual modulation hypothesis, released their first results from their search for annual modulation, induced from dark matter scatterings, using 1.7 years of data, with a total exposure of 97.7 kg years (Adhikari et al., 2019). They find that the data in the 2-6 keV energy interval is consistent with both the null hypothesis of no modulation as well as with the DAMA estimate of amplitude and phase at 68.3% c.l.
In this work, we apply (similar to the analysis done in Krishak et al. (2019) for the DAMA/LIBRA data) three independent model comparison techniques, viz. frequentist, Bayesian and information theory-based, to test the compatibility of the data with annual modulation over a background-only hypothesis.
For the signal hypothesis, we did two different sets of fits. For one fit, we kept the period (or angular frequency) the same as the DAMA best-fit value of one year (or 0.0172 radians/day). For the other fit, the period was also kept as a free parameter.
Our results using all three techniques are tabulated in Tables 4 and 5. When $\omega$ is a free parameter, the BIC test decisively prefers the background-only model, whereas the significance of both hypotheses is almost the same for the AIC and frequentist tests, and so it is not possible to favor either model from these tests.
When the period is fixed to the DAMA best-fit value of 1 year, we find that the BIC test strongly favors the background-only hypothesis, but the difference in BIC does not cross the threshold of 10 needed for it to be decisively favored. With the frequentist test, the difference is negligible. It remains to be seen whether the significance increases with more data for the frequentist and information theory-based tests.
On the other hand, when we do the model comparison using the Bayesian method of computing the Bayes factor, we find that the data strongly favor a constant background over a cosine fit (irrespective of whether $\omega$ is free or not).
This is the first proof of principle application of Bayesian and information theory based model comparison techniques to the COSINE-100 data and is complementary to the statistical tests done in the COSINE-100 results paper. To promote transparency in data analysis, we have made our analysis codes and data publicly available, which can be found at https://github.com/aditikrishak/COSINE100_analysis.
5 Acknowledgements
Aditi Krishak has been supported by DST-INSPIRE fellowship. We are grateful to Jay Hyun Jo and the COSINE-100 collaboration for providing us the raw data used in Fig 2 and sharing with us their best-fit values of the background parameters. We acknowledge useful correspondence with Josh Speagle regarding the dynesty algorithm. We are also grateful to the anonymous referee for constructive feedback on the manuscript.
- [1] Planck Collaboration et al. (2018): Planck Collaboration, N. Aghanim, Y. Akrami, M. Ashdown, J. Aumont, C. Baccigalupi, M. Ballardini, A. J. Banday, R. B. Barreiro, and N. Bartolo, arXiv e-prints, arXiv:1807.06209 (2018), arXiv:1807.06209 [astro-ph.CO].
- [2] Jungman et al. (1996): G. Jungman, M. Kamionkowski, and K. Griest, Physics Reports 267, 195 (1996), hep-ph/9506380.
- [3] Lee and Weinberg (1977): B. W. Lee and S. Weinberg, Physical Review Letters 39, 165 (1977).
- [4] Schumann (2019): M. Schumann, Journal of Physics G Nuclear Physics 46, 103003 (2019), arXiv:1903.03026 [astro-ph.CO].
- [5] Freese et al. (1988): K. Freese, J. Frieman, and A. Gould, Phys. Rev. D 37, 3388 (1988).
- [6] Bernabei et al. (2018): R. Bernabei, P. Belli, A. Bussolotti, F. Cappella, V. Caracciolo, R. Cerulli, C.-J. Dai, A. d’Angelo, A. Di Marco, H.-L. He, A. Incicchitti, X.-H. Ma, A. Mattei, V. Merlo, F. Montecchia, X.-D. Sheng, and Z.-P. Ye, Nuclear Physics and Atomic Energy 19, 307 (2018), arXiv:1805.10486 [hep-ex].
- [7] Herrero-Garcia et al. (2012): J. Herrero-Garcia, T. Schwetz, and J. Zupan, Phys. Rev. Lett. 109, 141301 (2012), arXiv:1205.0134 [hep-ph].
- [8] Catena et al. (2016): R. Catena, A. Ibarra, and S. Wild, J. Cosmology Astropart. Phys 2016, 039 (2016), arXiv:1602.04074 [hep-ph].
