TL;DR
This paper performs a statistical meta-analysis of neutron lifetime measurements, comparing different central estimates and error distributions, and highlights a persistent discrepancy between measurement methods.
Contribution
It introduces median and weighted mean estimates for neutron lifetime and analyzes their error distributions, revealing a significant discrepancy between measurement techniques.
Findings
Median neutron lifetime estimate: 881.5 ± 0.47 seconds
Error distributions are fit better by Student's t and Cauchy distributions than by a Gaussian
Discrepancy between beam and bottle measurements persists at 4-8σ significance
| Reference | Neutron Lifetime (s) | Type | Comment |
|---|---|---|---|
| Ezhov 18 (Ezhov et al., 2018) | | Bottle | Only in PDG19 |
| Serebrov 17 (Serebrov et al., 2018) | | Bottle | Only in PDG19 |
| Pattie 17 (Pattie et al., 2018) | | Bottle | Only in PDG19 |
| Leung 16 (Leung et al., 2016) | | Bottle | Neither PDG18 nor PDG19 |
| Arzumanov 15 (Arzumanov et al., 2015) | | Bottle | PDG |
| Yue 13 (Yue et al., 2013) | | Beam | Only in PDG18 |
| Steyerl 12 (Steyerl et al., 2012) | | Bottle | PDG |
| Pichlmaier 10 (Pichlmaier et al., 2010) | | Bottle | PDG |
| Serebrov 05 (Serebrov et al., 2005) | | Bottle | PDG |
| Byrne 96 (Byrne et al., 1996) | | Beam | Only in PDG18 |
| Mampe 93 (Mampe et al., 1993) | | Bottle | PDG |
| Alfimenkov 90 (Alfimenkov et al., 1990) | | Bottle | PDG (but not used) |
| Kossakowski 89 (Kossakowski et al., 1989) | | Beam | PDG (but not used) |
| Paul 89 (Paul et al., 1989) | | Bottle | PDG (but not used) |
| Last 88 (Last et al., 1988) | | Beam | PDG (but not used) |
| Spivak 88 (Spivak, 1988) | | Beam | PDG (but not used) |
| Kosvintsev 86 (Kosvintsev et al., 1986a) | | Bottle | PDG (but not used) |
| Kosvintsev 80 (Kosvintsev et al., 1986b) | | Bottle | PDG (but not used) |
| Christensen 72 (Christensen et al., 1972) | | Beam | PDG (but not used) |
A meta-analysis of neutron lifetime measurements
Ashwani Rajan1
Shantanu Desai2
1Department of Physics, Indian Institute of Technology, Guwahati, Assam-781039, India
2Department of Physics, Indian Institute of Technology, Hyderabad, Telangana-502285, India
Abstract
We calculate the median as well as the weighted mean central estimates for the neutron lifetime from a subset of measurements compiled in the 2019 update of the Particle Data Group (PDG). We then reconstruct the error distributions for the residuals using three different central estimates and check for consistency with a Gaussian distribution. We find that although the error distributions using the weighted mean as well as the median estimate are consistent with a Gaussian distribution, the Student’s $t$ and Cauchy distributions provide a better fit. The median statistics estimate of the neutron lifetime from these measurements is $881.5 \pm 0.47$ seconds. This can be used as an alternate estimate of the neutron lifetime. We also note that the discrepancy between beam and bottle-based measurements using median statistics of the neutron lifetime persists, with a significance between $4\sigma$ and $8\sigma$, depending on which combination of measurements is used.
PACS: 97.60.Jd, 04.80.Cc, 95.30.Sf
I Introduction
The precise measurement and theoretical estimate of the neutron lifetime are of paramount importance for both particle physics and astrophysics (Wietfeldt and Greene, 2011; Wietfeldt, 2014). The current weighted average of the seven best neutron lifetime measurements, reported in the 2019 version of the Particle Data Group listings (Tanabashi et al., 2018) (PDG, hereafter), is $879.4 \pm 0.6$ seconds. [Footnote 1: At the time of writing, the 2019 PDG update on neutron lifetime measurements is only available online at http://pdg.lbl.gov/2019/listings/rpp2019-list-n.pdf. The published version (Tanabashi et al., 2018) contains listings from 2018.] At face value, the weighted mean error from these measurements is equal to 0.4 seconds. The $\chi^2$ value for a constant neutron lifetime is equal to 14.6 for six degrees of freedom, corresponding to a $p$-value of 0.023 (Press et al., 1992). If we define the significance as the number of standard deviations that a Gaussian variable would fluctuate in one direction corresponding to this $p$-value (Ganguly and Desai, 2017), then the observed $p$-value corresponds to a $2\sigma$ discrepancy for a constant value of the neutron lifetime. Therefore, the PDG has scaled the weighted mean error by a scale factor equal to $\sqrt{\chi^2/\nu}$, where $\nu$ is the total degrees of freedom. With this multiplicative scale factor of 1.6, the total error is now equal to the reported value of 0.6 seconds. In other words, the subset of neutron lifetime measurements vetted by the PDG is inconsistent with a constant value at $2\sigma$ significance.
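The arithmetic in this paragraph can be reproduced with a short sketch. It assumes only the values quoted above (a $\chi^2$ of 14.6 for six degrees of freedom and an unscaled error of 0.4 seconds) and uses `scipy.stats`; the variable names are illustrative and not taken from the paper.

```python
import numpy as np
from scipy.stats import chi2, norm

chi2_value = 14.6      # chi-square for a constant lifetime, as quoted in the text
dof = 6                # seven measurements minus one fitted parameter
unscaled_error = 0.4   # weighted mean error in seconds, as quoted in the text

# p-value: probability of obtaining a chi-square at least this large
p_value = chi2.sf(chi2_value, dof)

# One-sided Gaussian significance (number of sigma) for this p-value
significance = norm.isf(p_value)

# PDG-style scale factor sqrt(chi2 / dof) applied to the weighted mean error
scale_factor = np.sqrt(chi2_value / dof)
scaled_error = unscaled_error * scale_factor

print(f"p-value      = {p_value:.4f}")
print(f"significance = {significance:.2f} sigma")
print(f"scale factor = {scale_factor:.2f}")
print(f"scaled error = {scaled_error:.2f} s")
```

Under these inputs the $p$-value comes out close to the quoted 0.023, the significance is about two standard deviations, and the scaled error rounds to 0.6 seconds.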
The theoretical neutron lifetime is a function of the axial vector to vector coupling ratio as well as the CKM matrix element $V_{ud}$ (Fornal and Grinstein, 2018; Czarnecki et al., 2018). The most recent theoretical estimate of the neutron lifetime is between 875.3 and 891.2 seconds, within $3\sigma$ (Fornal and Grinstein, 2018). Theoretical uncertainties in the neutron lifetime calculation, and expected improvements in the near future, have recently been reviewed by Czarnecki et al. (2018).
Neutron lifetime measurement techniques can be broadly classified into two types: ‘bottle’ and ‘beam’ based measurements. In the bottle method, ultra-cold neutrons are stored in a container (either a material bottle or a trap), and the neutron lifetime is measured by fitting the number of surviving neutrons to a decaying exponential. In the beam method, on the other hand, the neutrons in a beam and the protons produced from their $\beta$-decay are counted, and the lifetime is obtained from the neutron decay rate. More details about these techniques can be found in Wietfeldt and Greene (2011) and Wietfeldt (2014).
However, there is a long-standing discrepancy between these two methods used for neutron lifetime measurements (Greene and Geltenbort, 2016). As of 2018, the average from the two beam experiments (Byrne et al., 1996; Yue et al., 2013) included in the 2018 edition of PDG is equal to $888.0 \pm 2.0$ seconds (Fornal and Grinstein, 2018), and the same from five bottle experiments (Mampe et al., 1993; Serebrov et al., 2005; Pichlmaier et al., 2010; Steyerl et al., 2012; Arzumanov et al., 2015) is equal to $879.6 \pm 0.6$ seconds (Fornal and Grinstein, 2018). [Footnote 2: These two beam measurements are not used for the neutron lifetime estimate by the 2019 PDG edition.] This is formally a $4\sigma$ discrepancy and, as pointed out by Fornal and Grinstein (2018) (F18 hereafter), could either be evidence of uncontrolled systematics or point to new physics. Another possibility, not mentioned in the above works, is that the measurements could contain non-Gaussian errors, in which case the weighted mean cannot be used as the central estimate.
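As a rough illustration of how such a significance is obtained, the sketch below divides the difference between two independent estimates by their errors added in quadrature. The input averages of 888.0 ± 2.0 seconds (beam) and 879.6 ± 0.6 seconds (bottle) are taken here as assumed values for the F18 combination, and the helper name `discrepancy_sigma` is hypothetical.

```python
import math

def discrepancy_sigma(a, sigma_a, b, sigma_b):
    """Difference between two independent estimates in units of their combined error."""
    return abs(a - b) / math.hypot(sigma_a, sigma_b)

# Beam and bottle averages in seconds (input values assumed for this illustration)
beam, beam_err = 888.0, 2.0
bottle, bottle_err = 879.6, 0.6

n_sigma = discrepancy_sigma(beam, beam_err, bottle, bottle_err)
print(f"beam - bottle = {beam - bottle:.1f} s, i.e. {n_sigma:.1f} sigma")
```

With these inputs the difference of 8.4 seconds corresponds to roughly four standard deviations.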
The central estimate of the neutron lifetime quoted by the PDG, as well as by all other works that analyze this discrepancy, has been obtained from a weighted average of all the measurements. A central estimate based on weighted measurements makes the following main assumptions (Gott et al., 2001): (i) the individual data points are statistically independent and contain no systematic effects; (ii) the errors follow a Gaussian distribution. If any of the measurements contain catastrophic outliers or unaccounted systematic effects, then the second assumption is automatically violated. In that case, the weighted mean can produce extremely biased results. On the other hand, median statistics does not incorporate the individual measurement errors, and hence is unaffected by the presence of a few outliers. Secondly, even if the errors are not correctly estimated, the median gives a more robust estimate, as shown using simulations of Zeldovich’s thought experiment involving watches (Bethapudi and Desai, 2017). Even if a dataset is drawn from a distribution with infinite variance, such as the Cauchy distribution, the median is a more robust central estimate (Gott et al., 2001). Many additional pitfalls in using the weighted mean as a central estimate, and how using the median value ameliorates these problems, can be found in Gott et al. (2001), Bethapudi and Desai (2017), and references therein. The only assumption used for a median statistics based estimate is that the measurements are independent and free of systematic errors.
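The sensitivity of the weighted mean to a single bad measurement can be seen in a minimal sketch. The numbers below are synthetic and chosen purely for illustration (they are not neutron lifetime measurements): five mutually consistent values plus one outlier that also claims the smallest error bar.

```python
import numpy as np

# Synthetic measurements (not real data): five consistent values and one
# catastrophic outlier that also quotes the smallest error bar.
values = np.array([880.0, 881.0, 879.0, 880.5, 879.5, 900.0])
errors = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 0.5])

# Inverse-variance weighted mean: the outlier carries four times the weight
weights = 1.0 / errors**2
weighted_mean = np.sum(weights * values) / np.sum(weights)

# The median ignores the quoted errors entirely
median = np.median(values)

print(f"weighted mean = {weighted_mean:.2f}")
print(f"median        = {median:.2f}")
```

In this toy case the weighted mean is dragged several units toward the outlier, while the median stays within the cluster of consistent values.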
In the last decade, Ratra and collaborators have shown that the error distributions for a whole slew of astrophysical and cosmological measurements are inconsistent with a Gaussian distribution (Gott et al., 2001; Chen and Ratra, 2003, 2011; Chen et al., 2003; Crandall et al., 2015; Crandall and Ratra, 2015, 2014; Bethapudi and Desai, 2017; Rajan and Desai, 2018; Penton et al., 2018; Camarillo et al., 2018a). The datasets they explored for this purpose include measurements of $H_0$ (Chen et al., 2003), Lithium-7 measurements (Crandall et al., 2015; see also Zhang, 2017), the distance to the LMC (Crandall and Ratra, 2015), the distance to the galactic center (Camarillo et al., 2018b), and the Deuterium abundance (Penton et al., 2018). For each of these datasets, they fit the data to a variety of probability distributions. From all these studies, they inferred that the error distribution is non-Gaussian. Consequently, they have argued that median statistics should be used for the central estimates of these parameters instead of the weighted mean (Gott et al., 2001; Bethapudi and Desai, 2017). To the best of our knowledge, no one has investigated the Gaussianity of the neutron lifetime measurements (or, for that matter, of any other dataset in PDG). The importance of doing such tests has been stressed in a number of works (Gott et al., 2001; Crandall and Ratra, 2014; Rajan and Desai, 2018; Bailey, 2017). Due to the non-Gaussianity of the error residuals for the aforementioned astrophysical datasets, median statistics has been used to obtain central estimates of some of these quantities, such as the Hubble constant (Gott et al., 2001; Chen and Ratra, 2011; Bethapudi and Desai, 2017), Newton’s gravitational constant (Bethapudi and Desai, 2017), the mean matter density (Chen and Ratra, 2003), and other cosmological parameters (Crandall and Ratra, 2014).
Alternately, one can use the method recently proposed by Cowan (2019), in which the uncertainty in the systematic errors is modeled using probability distributions.
Given the physics implications of these discrepancies in the neutron lifetime measurements, we revisit the question of whether the errors are non-Gaussian and obtain a more robust central estimate from the vetted measurements in PDG, which can be easily compared with the theoretical estimate. The outline of this manuscript is as follows. The dataset used for our analysis is described in Sect. II. Our analysis procedure and results are described in Sect. III. We discuss the discrepancy between beam and bottle-based measurements in Sect. IV. We conclude in Sect. V.
II Neutron lifetime data
We briefly review the neutron lifetime measurements used for this analysis. The 2019 edition of PDG lists a total of 27 measurements from 1972 to the present. Of these, only seven have been used by the PDG to obtain the central estimate. Using these seven measurements, a weighted mean central value of $879.4 \pm 0.6$ s was estimated, wherein the error has been rescaled by a factor of 1.6. All of these are bottle-based experiments. The corresponding value from the 2018 PDG edition was $880.2 \pm 1.0$ s, with five of the measurements being bottle-based and two beam-based. The remaining measurements were ignored because the error bars of some of the pre-1980 measurements were large, because the results of old measurements were reanalyzed, or because the measurements were withdrawn. However, a few measurements have also been culled without any explanation. For our analysis, we also include all older measurements, except those that were reanalyzed or withdrawn. We include one additional measurement (Leung et al., 2016), which was not included in either the 2018 or 2019 PDG. In all, we have collected a total of 19 measurements for our analysis, which are tabulated in Table 1. We note that in addition to these direct experimental measurements of the neutron lifetime, there are also cosmological constraints on the neutron lifetime (Salvati et al., 2016). We do not include them in our analysis, as these results are model-dependent and are not direct experimental measurements.
III Analysis
The first step in analyzing the Gaussianity of the errors of a dataset is to obtain a central estimate using the available data. For this analysis, we use all the 19 measurements tabulated in Table 1. We do not check for Gaussianity of the beam and bottle-based measurements separately, as the total number of data points in each category is too small for a robust test. However, once the number of measurements in each category grows, this should also be tested to check for systematics in each category. Similar to the works by Ratra et al. (e.g. Penton et al., 2018; P18 hereafter), we consider two central estimates: the weighted mean and the median. We note that in P18, a similar analysis was done using 15 deuterium abundance measurements.
The median value ($\tau_{\rm med}$) corresponds to the 50th percentile value, for which half of the data points are below and half above. The standard deviation of the median depends upon the distribution from which it is sampled. A number of methods have been proposed in the literature to calculate the sample variance of the median (Woodruff, 1952; Maritz and Jarrett, 1978; Price and Bonett, 2001). For this work, to estimate the 68% confidence interval on the median, we use the methodology in P18, based on Gott et al. (2001), as the estimate is made using only the data and is independent of the sampling distribution. The weighted mean central value ($\tau_{\rm wm}$) using the observed neutron lifetime measurements ($\tau_i$) is given by (Bevington and Robinson, 1992):

$$\tau_{\rm wm} = \frac{\sum_{i=1}^{N} \tau_i/\sigma_i^2}{\sum_{i=1}^{N} 1/\sigma_i^2}, \tag{1}$$

where $\sigma_i$ denotes the total error in each measurement. The total weighted mean error is given by

$$\sigma_{\rm wm} = \frac{1}{\sqrt{\sum_{i=1}^{N} 1/\sigma_i^2}}. \tag{2}$$

From the measurements in Table 1, the weighted mean estimate follows from Eqs. 1 and 2, and the median estimate is calculated to be $881.5 \pm 0.47$ seconds.
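A minimal sketch of both central estimates is given below. The weighted mean uses the standard inverse-variance formulas. The interval on the median uses a generic order-statistic (binomial) construction in the spirit of Gott et al. (2001); it is an assumption made for illustration and is not necessarily the exact procedure followed in P18. The function names and the demonstration values are hypothetical and are not the measurements of Table 1.

```python
import numpy as np
from scipy.stats import binom

def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its error."""
    values = np.asarray(values, dtype=float)
    errors = np.asarray(errors, dtype=float)
    weights = 1.0 / errors**2
    mean = np.sum(weights * values) / np.sum(weights)
    return mean, 1.0 / np.sqrt(np.sum(weights))

def median_with_interval(values, level=0.68):
    """Median and a distribution-free interval built from order statistics."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    tail = (1.0 - level) / 2.0
    # The number of measurements below the true median follows Binomial(n, 1/2),
    # so binomial quantiles give the (1-indexed) ranks that bracket the median.
    lower_rank = int(binom.ppf(tail, n, 0.5))
    upper_rank = int(binom.ppf(1.0 - tail, n, 0.5)) + 1
    # Clamp to the available data; for very small n the coverage then drops
    # below the requested level.
    lower_rank = max(lower_rank, 1)
    upper_rank = min(upper_rank, n)
    return np.median(x), x[lower_rank - 1], x[upper_rank - 1]

# Synthetic demonstration values (not the measurements of Table 1)
demo_values = np.arange(1.0, 20.0)  # 19 values: 1, 2, ..., 19
median, lower, upper = median_with_interval(demo_values)
mean, mean_err = weighted_mean([880.0, 882.0], [1.0, 1.0])

print(f"median = {median}, interval = [{lower}, {upper}]")
print(f"weighted mean = {mean} +/- {mean_err:.3f}")
```

The interval depends only on the ranks of the sorted data, which is why it does not require any assumption about the shape of the error distribution.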
III.1 Error Distributions
Once we have a central estimate for the neutron lifetime ($\tau_{\rm CE}$) using one of the above methods, we calculate the residual error using (Penton et al., 2018; Camarillo et al., 2018b)

$$N_{\sigma_i} = \frac{\tau_i - \tau_{\rm CE}}{\sqrt{\sigma_i^2 + \sigma_{\rm CE}^2}}. \tag{3}$$

In the above equation, $\sigma_{\rm CE}$ is the error in the central estimate and $\sigma_i$ is the error in the individual measurement. Similar to Penton et al. (2018) and Camarillo et al. (2018b, a), we denote our error distributions for the median ($\tau_{\rm med}$) and the weighted mean ($\tau_{\rm wm}$) calculated from Eq. 3 by $N_{\sigma_i}^{\rm med}$ and $N_{\sigma_i}^{\rm wm+}$ respectively. If the central estimate is determined from the weighted mean, one must also account for correlations between each measurement and the central estimate. The modified version of the error distribution, which accounts for these correlations, is given by (Camarillo et al., 2018b)

$$N_{\sigma_i}^{\rm wm*} = \frac{\tau_i - \tau_{\rm wm}}{\sqrt{\sigma_i^2 - \sigma_{\rm wm}^2}}. \tag{4}$$

Each of these three sets of residuals is then symmetrized around zero. We now fit the symmetrized histogram of $N_\sigma$ to multiple probability distributions, as described in the next section.
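The residual calculation can be sketched as follows, assuming the definitions used by Penton et al. (2018) and Camarillo et al. (2018b): the difference from the central estimate divided by the quadrature sum of the two errors, or by the quadrature difference when correlations with the weighted mean are taken into account. The function names and the three input values are hypothetical.

```python
import numpy as np

def n_sigma(values, errors, center, center_error, correlated=False):
    """Residuals in units of sigma with respect to a central estimate.

    correlated=False adds the errors in quadrature; correlated=True
    subtracts them, which applies when the central estimate is the
    weighted mean of the same measurements.
    """
    values = np.asarray(values, dtype=float)
    errors = np.asarray(errors, dtype=float)
    if correlated:
        denominator = np.sqrt(errors**2 - center_error**2)
    else:
        denominator = np.sqrt(errors**2 + center_error**2)
    return (values - center) / denominator

def symmetrize(residuals):
    """Mirror the residuals around zero."""
    residuals = np.asarray(residuals, dtype=float)
    return np.concatenate([residuals, -residuals])

# Synthetic inputs (not real measurements)
values = np.array([878.0, 880.0, 882.0])
errors = np.array([1.0, 2.0, 1.0])
weights = 1.0 / errors**2
center = np.sum(weights * values) / np.sum(weights)
center_error = 1.0 / np.sqrt(np.sum(weights))

res_plus = n_sigma(values, errors, center, center_error)
res_star = n_sigma(values, errors, center, center_error, correlated=True)
sym = symmetrize(res_plus)
print(res_plus, res_star, sym)
```

The correlated form is always well defined for the weighted mean, since its error is smaller than every individual error.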
III.2 Fits to probability distributions
We fit the symmetrized histograms for each of the $N_\sigma$’s to a Gaussian distribution as well as to other distributions, namely the Cauchy, Laplacian, and Student’s $t$ distributions, to see which of these is most compatible with the data. This is similar in spirit to recent works by Ratra et al., such as P18 and references therein. We briefly review this procedure. More details can be found in P18.
The Gaussian distribution we consider has zero mean and standard deviation equal to unity:

$$P(|N_\sigma|) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{|N_\sigma|^2}{2}\right). \tag{5}$$

The second distribution we consider is the Laplacian distribution, which has a sharper peak and longer tails than a Gaussian distribution and is described by

$$P(|N_\sigma|) = \frac{1}{2} \exp\left(-|N_\sigma|\right). \tag{6}$$

The third distribution we use is the Cauchy or Lorentz distribution. It has longer and thicker tails compared to a Gaussian distribution. It is described by

$$P(|N_\sigma|) = \frac{1}{\pi} \frac{1}{1 + |N_\sigma|^2}. \tag{7}$$

Finally, we use the Student’s $t$ distribution, characterized by $n$ (which is sometimes referred to as the “degrees of freedom”) and given by

$$P(|N_\sigma|) = \frac{\Gamma[(n+1)/2]}{\sqrt{\pi n}\,\Gamma(n/2)} \left(1 + \frac{|N_\sigma|^2}{n}\right)^{-(n+1)/2}. \tag{8}$$

For $n = 1$, the Student’s $t$ distribution is the same as the Cauchy distribution, and it approaches the Gaussian distribution for $n \to \infty$. For our analysis, we vary $n$ from 2 to 2000. Note that the Student’s $t$ distribution for the error residuals can be obtained by modeling the error in the systematic errors as a gamma distribution (Cowan, 2019).

In addition to comparing the error distributions to the PDFs in Eqs. 5, 6, 7, and 8, which depend on $|N_\sigma|$, we also compare to these distributions after replacing $|N_\sigma|$ by $|N_\sigma|/S$, where $S$ is an arbitrary scale factor, which we vary from 0.001 to 2.5 in steps of size 0.01.
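The two limiting cases of the Student’s $t$ distribution mentioned above can be checked numerically. The sketch below uses the `t`, `cauchy`, and `norm` distributions from `scipy.stats`; the grid of evaluation points is an arbitrary choice made for illustration.

```python
import numpy as np
from scipy.stats import cauchy, norm, t

# Arbitrary evaluation grid for the comparison
x = np.linspace(-5.0, 5.0, 201)

# One degree of freedom: the Student's t density coincides with the Cauchy density
diff_cauchy = np.max(np.abs(t.pdf(x, 1) - cauchy.pdf(x)))

# Many degrees of freedom: the Student's t density approaches the Gaussian density
diff_gauss = np.max(np.abs(t.pdf(x, 2000) - norm.pdf(x)))

print(f"max |t(n=1) - Cauchy|      = {diff_cauchy:.2e}")
print(f"max |t(n=2000) - Gaussian| = {diff_gauss:.2e}")
```

This is why scanning the degrees of freedom from small to large values interpolates between a heavy-tailed and a nearly Gaussian description of the residuals.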
The comparison is done using the one-sample unbinned Kolmogorov-Smirnov (K-S) test (Ivezić et al., 2014). The K-S test is based on the $D$-statistic, which measures the maximum distance between two cumulative distributions. The K-S test is widely used in both astrophysics and particle physics for comparing a dataset to a wide range of probability distributions, as it is agnostic to the distribution against which it is being tested and does not depend on the size of the sample. Furthermore, critical values based upon the $D$-statistic have been calculated in the literature and can be easily computed for any value of $D$. This test is also invariant to reparameterization of the data. The one-sample K-S test can therefore serve as a goodness-of-fit test. Although some concerns have been raised regarding the incorrect usage of the K-S test in the astrophysics literature, as well as other caveats and limitations of this test (Babu and Feigelson, 2006), these do not apply in our case, and hence we use the K-S test to evaluate the compatibility of the error residuals with various distributions. In this case, the two distributions are the error histogram and the parent PDF to which it is compared. From the $D$-statistic, the K-S test also provides a $p$-value, whose analytic formula can be found in standard statistics references (Ivezić et al., 2014; Penton et al., 2018). For this work, we have used the scipy module in Python for the computations. The higher the $p$-value, the more similar the two distributions are, whereas a low $p$-value indicates an inconsistency between the distributions. Our results for the comparison with all four distributions are summarized in Table 2.
We find that for all three estimates, the Gaussian distribution is not the best fit, unless the scale factor is different from unity. The data are much more consistent with the Cauchy or Student’s $t$ distribution. However, none of the $p$-values for the Gaussian distribution are small enough to reject the null hypothesis.
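The scan over the scale factor can be sketched with `scipy.stats.kstest`. This is an illustrative outline and not the authors' code: the helper `best_scale` is hypothetical, the residuals are synthetic Gaussian quantiles rather than the values derived from Table 1, and only a single value of the Student’s $t$ degrees of freedom is shown.

```python
import numpy as np
from scipy.stats import kstest, norm

def best_scale(sample, dist_name, shape_args=(), scales=None):
    """Scan the scale factor S and return (S, p) with the highest K-S p-value."""
    if scales is None:
        scales = np.arange(0.001, 2.5, 0.01)
    best_s, best_p = None, -1.0
    for s in scales:
        # args are forwarded to the named distribution's cdf as (*shapes, loc, scale)
        result = kstest(sample, dist_name, args=(*shape_args, 0.0, s))
        if result.pvalue > best_p:
            best_s, best_p = s, result.pvalue
    return best_s, best_p

# Synthetic residuals (not real data): 19 Gaussian quantiles, then symmetrized
residuals = norm.ppf((np.arange(19) + 0.5) / 19)
sample = np.concatenate([residuals, -residuals])

# Gaussian, Laplacian, Cauchy, and Student's t with two degrees of freedom
for name, shapes in [("norm", ()), ("laplace", ()), ("cauchy", ()), ("t", (2,))]:
    s, p = best_scale(sample, name, shapes)
    print(f"{name:8s} best S = {s:.3f}, p-value = {p:.3f}")
```

A full analysis along the lines described above would repeat the scan for each degree of freedom of the Student’s $t$ distribution and for each of the three sets of residuals.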
IV Discrepancy between beam and bottle measurements
We now quantify the significance of the discrepancy between beam and bottle-based experiments using central estimates based on the median statistics. We do this analysis using three different combinations of datasets for beam and bottle based experiments. A summary of these comparisons can be found in Table 3.
We first use the same data points as in F18 (Fornal and Grinstein, 2018), who argued for a $4\sigma$ discrepancy. We obtain a median estimate using the same bottle-based experiments considered in F18 (Mampe et al., 1993; Serebrov et al., 2005; Pichlmaier et al., 2010; Steyerl et al., 2012; Arzumanov et al., 2015), and compare it with the beam-based experiments therein (Byrne et al., 1996; Yue et al., 2013). The median lifetime of the five bottle-based experiments, along with its median error bar, and the corresponding median lifetime for the two beam-based experiments considered in F18 are summarized in Table 3. Since it is not possible to obtain a median error estimate with just two measurements, we do not quote a median uncertainty for the beam-based value. The results do not change even after including the two additional bottle-based measurements (Serebrov et al., 2018; Pattie et al., 2018) not used for the F18 average. Therefore, considering the median statistics estimates, the discrepancy is about $6\sigma$.
If we do this comparison by including all the measurements in Table 1, we obtain the median lifetimes for all the bottle-based and all the beam-based experiments summarized in Table 3. Comparing the median estimates between the beam and bottle-based measurements then amounts to a $3.79\sigma$ discrepancy.
If we then redo this comparison for the subset of measurements in Table 1 having total error less than 10 seconds, we obtain the median central estimate for the bottle-based experiments summarized in Table 3. Since the number of beam-based measurements in this subset is very small (three), we can only obtain a central estimate, which is equal to 889.2 seconds. Therefore, the total discrepancy is about $8\sigma$.
Hence, we infer that the discrepancy between beam and bottle-based measurements persists, even when median statistics is used for the central estimate of the neutron lifetime.
V Conclusions
There has been a long-standing discrepancy in the literature between the neutron lifetime measurements from the two different techniques, viz. bottle and beam-based methods. As of 2019, the current discrepancy is about $4\sigma$ (Fornal and Grinstein, 2018). To get some insight into these issues, we carried out an extensive meta-analysis of the vetted neutron lifetime measurements compiled in the literature. We first use a compilation of 19 measurements of the neutron lifetime and their corresponding errors listed in the 2019 edition of PDG (Tanabashi et al., 2018) (cf. Table 1), in order to ascertain the non-Gaussianity of the residuals and to obtain a central estimate. The error distributions were analyzed in the same way as previously done for a variety of astrophysical measurements by Ratra et al. (Penton et al., 2018; Camarillo et al., 2018b, a). For this purpose, the central estimate was obtained using both the weighted mean (with and without correlations) and the median value. The median estimate does not incorporate the errors in the neutron lifetime. We then fit these residuals to four distributions, viz. the Gaussian, Laplace, Cauchy, and Student’s $t$ distributions. The resulting fits are tabulated in Table 2.
We conclude from these observations that none of the $p$-values (obtained using all three central estimates) are small enough to reject the Gaussian distribution for the error residuals. However, the Student’s $t$ and Cauchy distributions provide a better fit than the Gaussian distribution.
Therefore, more data points are necessary to robustly determine if the error residuals are consistent with a Gaussian distribution. Nevertheless, it is a useful exercise to obtain the central estimate of the neutron lifetime with median statistics, and to check if the discrepancy between beam and bottle-based measurements persists using median statistics. The median value, along with $1\sigma$ error bars, which we obtain using the 19 measurements is $881.5 \pm 0.47$ seconds. This estimate is complementary to the PDG-based result obtained using the weighted mean statistic, which includes the addition of an ad-hoc scale factor. This value can be used as an alternate estimate of the observed neutron lifetime, and used for comparison with the theoretical estimate, which is currently between 875.3 and 891.2 seconds within $3\sigma$ (Fornal and Grinstein, 2018).
We then used the median estimate to evaluate the statistical significance of the discrepancy between beam and bottle-based measurements. When we use the same measurements as in F18, the discrepancy increases to $6\sigma$. If we consider all the measurements in Table 1, the discrepancy becomes $3.79\sigma$, and it is about $8\sigma$ if we restrict the comparison to measurements with total error less than 10 seconds.
Acknowledgements.
We are grateful to Tommaso Dorigo for his blog article about the F18 paper, which brought our attention to this problem. We also thank Bharat Ratra for explaining in detail the methodology used in P18 and in his earlier works.
References
- [1] Wietfeldt and Greene (2011): F. E. Wietfeldt and G. L. Greene, Reviews of Modern Physics 83, 1173 (2011).
- [2] Wietfeldt (2014): F. E. Wietfeldt, in 8th International Workshop on the CKM Unitarity Triangle (CKM 2014), Vienna, Austria, September 8-12, 2014 (2014), arXiv:1411.3687.
- [3] Tanabashi et al. (2018): M. Tanabashi et al. (Particle Data Group), Phys. Rev. D 98, 030001 (2018).
- [4] Press et al. (1992): W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing (1992).
- [5] Ganguly and Desai (2017): S. Ganguly and S. Desai, Astroparticle Physics 94, 17 (2017), arXiv:1706.01202.
- [6] Fornal and Grinstein (2018): B. Fornal and B. Grinstein, Phys. Rev. Lett. 120, 191801 (2018).
- [7] Czarnecki et al. (2018): A. Czarnecki, W. J. Marciano, and A. Sirlin, Phys. Rev. Lett. 120, 202002 (2018), arXiv:1802.01804.
- [8] Greene and Geltenbort (2016): G. L. Greene and P. Geltenbort, Scientific American 314, 36 (2016).
