Measurement of the $B^0_s\to\mu^+\mu^-$ branching fraction and effective lifetime and search for $B^0\to\mu^+\mu^-$ decays
LHCb collaboration: R. Aaij, B. Adeva, M. Adinolfi, Z. Ajaltouni, S. Akar, J. Albrecht, F. Alessio, M. Alexander, S. Ali, G. Alkhazov, P. Alvarez Cartelle, A.A. Alves Jr, S. Amato, S. Amerio, Y. Amhis, L. An, L. Anderlini, G. Andreassi, M. Andreotti, J.E. Andrews

TL;DR
This paper reports the first single-experiment observation of the rare decay $B^0_s\to\mu^+\mu^-$, with a measured branching fraction and effective lifetime, and sets an upper limit on $B^0\to\mu^+\mu^-$, all consistent with the Standard Model.
Contribution
The paper presents the first observation of the $B^0_s\to\mu^+\mu^-$ decay in a single experiment, including its branching fraction and effective lifetime, and provides a new upper limit for $B^0\to\mu^+\mu^-$.
Findings
Observation of $B^0_s\to\mu^+\mu^-$ with 7.8 sigma significance
Measured branching fraction $\mathcal{B}(B^0_s\to\mu^+\mu^-) = 3.0\times 10^{-9}$
First measurement of the $B^0_s\to\mu^+\mu^-$ effective lifetime
Abstract
A search for the rare decays $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ is performed at the LHCb experiment using data collected in $pp$ collisions corresponding to a total integrated luminosity of 4.4 fb$^{-1}$. An excess of $B^0_s\to\mu^+\mu^-$ decays is observed with a significance of 7.8 standard deviations, representing the first observation of this decay in a single experiment. The branching fraction is measured to be $\mathcal{B}(B^0_s\to\mu^+\mu^-) = (3.0 \pm 0.6^{+0.3}_{-0.2})\times 10^{-9}$, where the first uncertainty is statistical and the second systematic. The first measurement of the $B^0_s\to\mu^+\mu^-$ effective lifetime, $\tau(B^0_s\to\mu^+\mu^-) = 2.04 \pm 0.44 \pm 0.05$ ps, is reported. No significant excess of $B^0\to\mu^+\mu^-$ decays is found and a 95% confidence level upper limit, $\mathcal{B}(B^0\to\mu^+\mu^-) < 3.4\times 10^{-10}$, is determined. All results are…
EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH (CERN)
CERN-EP-2017-041
LHCb-PAPER-2017-001
March 16, 2017
Measurement of the $B^0_s\to\mu^+\mu^-$ branching fraction and effective lifetime and search for $B^0\to\mu^+\mu^-$ decays
The LHCb collaboration (authors are listed at the end of this paper)
A search for the rare decays $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ is performed at the LHCb experiment using data collected in $pp$ collisions corresponding to a total integrated luminosity of 4.4 fb$^{-1}$. An excess of $B^0_s\to\mu^+\mu^-$ decays is observed with a significance of 7.8 standard deviations, representing the first observation of this decay in a single experiment. The branching fraction is measured to be $\mathcal{B}(B^0_s\to\mu^+\mu^-) = (3.0 \pm 0.6^{+0.3}_{-0.2})\times 10^{-9}$, where the first uncertainty is statistical and the second systematic. The first measurement of the $B^0_s\to\mu^+\mu^-$ effective lifetime, $\tau(B^0_s\to\mu^+\mu^-) = 2.04 \pm 0.44 \pm 0.05$ ps, is reported. No significant excess of $B^0\to\mu^+\mu^-$ decays is found and a 95% confidence level upper limit, $\mathcal{B}(B^0\to\mu^+\mu^-) < 3.4\times 10^{-10}$, is determined. All results are in agreement with the Standard Model expectations.
Published in Phys. Rev. Lett. 118 (2017), 191801
© CERN on behalf of the LHCb collaboration, licence CC-BY-4.0.
Within the Standard Model (SM) of particle physics, the $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ decays are very rare, because they only occur through loop diagrams and are helicity-suppressed. Since they are characterised by a purely leptonic final state, and thanks to the progress in lattice QCD calculations [1, 2, 3], their time-integrated branching fractions, $\mathcal{B}(B^0_s\to\mu^+\mu^-) = (3.65 \pm 0.23)\times 10^{-9}$ and $\mathcal{B}(B^0\to\mu^+\mu^-) = (1.06 \pm 0.09)\times 10^{-10}$ [4], are predicted in the SM with small uncertainty. These features make the $B^0_{(s)}\to\mu^+\mu^-$ decays sensitive probes for physics beyond the SM, for example an extended Higgs sector [5, 6, 7]. The measurement of these processes has attracted considerable theoretical and experimental interest, culminating in the recent observation of the $B^0_s\to\mu^+\mu^-$ decay and evidence of the $B^0\to\mu^+\mu^-$ decay reported by the LHCb and CMS collaborations [8]. This result was obtained by combining their datasets collected in $pp$ collisions in 2011 and 2012 [9, 10]. The measured branching fractions, $\mathcal{B}(B^0_s\to\mu^+\mu^-) = (2.8^{+0.7}_{-0.6})\times 10^{-9}$ and $\mathcal{B}(B^0\to\mu^+\mu^-) = (3.9^{+1.6}_{-1.4})\times 10^{-10}$, are consistent with SM predictions. The ATLAS collaboration has also recently reported a search for these decays [11].
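In the $B^0_s$ system, the light and heavy mass eigenstates are characterised by a sizable difference between their decay widths, $\Delta\Gamma$ [12]. In the SM, only the heavy state decays to $\mu^+\mu^-$, but this condition does not necessarily hold in New Physics scenarios [13]. The contributions from the two states can be disentangled by measuring the $B^0_s\to\mu^+\mu^-$ effective lifetime, which, in the search for physics beyond the SM, is a complementary probe to the branching fraction measurement. The effective lifetime is defined as $\tau_{\mu^+\mu^-} \equiv \int_0^\infty t\,\Gamma(B_s(t)\to\mu^+\mu^-)\,dt \,/ \int_0^\infty \Gamma(B_s(t)\to\mu^+\mu^-)\,dt$, where $t$ is the decay time of the $B^0_s$ or $\bar{B}^0_s$ meson and $\Gamma(B_s(t)\to\mu^+\mu^-) \equiv \Gamma(B^0_s(t)\to\mu^+\mu^-) + \Gamma(\bar{B}^0_s(t)\to\mu^+\mu^-)$. The relation [14]

$$\tau_{\mu^+\mu^-} = \frac{\tau_{B^0_s}}{1-y_s^2}\left[\frac{1+2A_{\Delta\Gamma}\,y_s+y_s^2}{1+A_{\Delta\Gamma}\,y_s}\right] \qquad (1)$$

holds, where $\tau_{B^0_s}$ is the $B^0_s$ mean lifetime and $y_s \equiv \tau_{B^0_s}\Delta\Gamma/2$ [12, 15]. The parameter $A_{\Delta\Gamma}$ is defined as $A_{\Delta\Gamma} \equiv -2\,\Re(\lambda)/(1+|\lambda|^2)$, with $\lambda = (q/p)\,(A(\bar{B}^0_s\to\mu^+\mu^-)/A(B^0_s\to\mu^+\mu^-))$. The complex coefficients $q$ and $p$ define the mass eigenstates of the $B^0_s$-$\bar{B}^0_s$ system in terms of the flavour eigenstates (see, e.g., Ref. [12]), and $A(B^0_s\to\mu^+\mu^-)$ ($A(\bar{B}^0_s\to\mu^+\mu^-)$) is the $B^0_s$ ($\bar{B}^0_s$) decay amplitude. In the SM the quantity $A_{\Delta\Gamma}$ is equal to unity but can assume any value in the range $-1 \le A_{\Delta\Gamma} \le 1$ in New Physics scenarios.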
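As a numerical illustration of the relation between the effective lifetime and $A_{\Delta\Gamma}$, the sketch below evaluates Eq. (1) for three values of $A_{\Delta\Gamma}$. The inputs `TAU_BS` and `Y_S` are illustrative assumptions and are not quoted from this text; the function name is hypothetical.

```python
def effective_lifetime(a_dg, tau_bs, y_s):
    """Effective lifetime for a given A_DeltaGamma, assuming the form
    tau_eff = tau_bs / (1 - y_s^2) * (1 + 2*A*y_s + y_s^2) / (1 + A*y_s).
    """
    if not -1.0 <= a_dg <= 1.0:
        raise ValueError("A_DeltaGamma must lie in [-1, 1]")
    prefactor = tau_bs / (1.0 - y_s**2)
    return prefactor * (1.0 + 2.0 * a_dg * y_s + y_s**2) / (1.0 + a_dg * y_s)


# Illustrative inputs (assumed): mean lifetime in ps and dimensionless y_s.
TAU_BS = 1.510
Y_S = 0.062

for a in (1.0, 0.0, -1.0):
    tau_eff = effective_lifetime(a, TAU_BS, Y_S)
    print(f"A_DeltaGamma = {a:+.0f}: tau_eff = {tau_eff:.3f} ps")
```

For $A_{\Delta\Gamma} = +1$ the expression reduces to $\tau_{B^0_s}/(1-y_s)$, the lifetime of the heavy eigenstate, and for $A_{\Delta\Gamma} = -1$ to $\tau_{B^0_s}/(1+y_s)$, that of the light eigenstate.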
This Letter reports measurements of the $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ time-integrated branching fractions, which supersede the previous LHCb results [9], and the first measurement of the $B^0_s\to\mu^+\mu^-$ effective lifetime. Results are based on data collected with the LHCb detector, corresponding to an integrated luminosity of 1 fb$^{-1}$ of $pp$ collisions at a centre-of-mass energy $\sqrt{s} = 7$ TeV, 2 fb$^{-1}$ at $\sqrt{s} = 8$ TeV and 1.4 fb$^{-1}$ recorded at $\sqrt{s} = 13$ TeV. The first two datasets are referred to as Run 1 and the latter as Run 2.
At various stages of the analysis multivariate classifiers are employed to select the signal. In particular, after trigger and loose selection requirements, $B^0_{(s)}\to\mu^+\mu^-$ candidates are classified according to their dimuon mass and the output variable, BDT, of a multivariate classifier based on a boosted decision tree [16], which is employed to separate signal and combinatorial background. The signal yield is determined from a fit to the dimuon mass distribution of candidates and is converted into a branching fraction using as normalisation modes the decays $B^0\to K^+\pi^-$ and $B^+\to J/\psi K^+$, with $J/\psi\to\mu^+\mu^-$ (inclusion of charge-conjugated processes is implied throughout this Letter).
The analysis strategy is similar to that employed in Ref. [9] and has been optimised to enhance the sensitivity to both $B^0_s$ and $B^0$ decays to $\mu^+\mu^-$. This is achieved through a better rejection of misidentified $b$-hadron decays such as $B^0_{(s)}\to h^+h'^-$ (where $h^{(\prime)} = \pi, K$) and the development of an improved boosted decision tree for the BDT classifier. The $B^0_s\to\mu^+\mu^-$ effective lifetime is measured from the background-subtracted decay-time distribution of signal candidates in the lowest-background BDT region as defined later. To avoid potential biases, candidates in the dimuon mass signal region were not examined until the analysis procedure was finalised.
The LHCb detector is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, described in detail in Refs. [17, 18]. It includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. Particle identification is provided by two ring-imaging Cherenkov detectors, an electromagnetic and a hadronic calorimeter, and a muon system composed of alternating layers of iron and multiwire proportional chambers. The simulated events used in this analysis are produced using the software described in Refs. [19, 26].
Candidate events for signal and normalisation are selected by a hardware trigger followed by a software trigger [27]. The $B^0_{(s)}\to\mu^+\mu^-$ candidates are predominantly selected by single-muon and dimuon triggers. The $B^+\to J/\psi K^+$ candidates are selected in a very similar way, the only difference being a different dimuon mass requirement in the software trigger. Candidate $B^0_{(s)}\to h^+h'^-$ decays are used as control and normalisation channels.
The $B^0_{(s)}\to\mu^+\mu^-$ candidates are reconstructed by combining two oppositely charged particles that satisfy requirements on the transverse momentum with respect to the beam, $p_{\rm T}$, and on the momentum, $p$, and that have high-quality muon identification [28]. Compared to the previous analysis, the muon identification requirements are tightened such that the $B^0_{(s)}\to h^+h'^-$ misidentified background is reduced by approximately 50%, while the signal efficiency decreases by about 10%. The muon candidates are required to form a secondary vertex with a vertex-fit $\chi^2$ per degree of freedom smaller than 9 and separated from any primary $pp$ interaction vertex (PV) by a flight distance significance greater than 15. Only muon candidate tracks with a $\chi^2_{\rm IP}$ above a threshold for any PV are selected, where $\chi^2_{\rm IP}$ is defined as the difference between the vertex-fit $\chi^2$ of the PV formed with and without the particle in question. In the selection, $B^0_{(s)}\to\mu^+\mu^-$ candidates must satisfy an upper bound on the decay time, a requirement on the $\chi^2_{\rm IP}$ with respect to the PV for which the $\chi^2_{\rm IP}$ is minimal (henceforth called the $B^0_{(s)}$ PV), and a requirement on the dimuon mass range. A $B^0_{(s)}$ candidate is rejected if either of the two candidate muons combined with any other oppositely charged muon candidate in the event has a mass within 30 MeV/$c^2$ of the $J/\psi$ mass [15]. The normalisation channels are selected with almost identical requirements to those applied to the signal sample. The $B^0_{(s)}\to h^+h'^-$ selection is the same as that of $B^0_{(s)}\to\mu^+\mu^-$, except that the muon identification criteria are replaced with hadron identification requirements. The $B^+\to J/\psi K^+$ decay is reconstructed by combining a muon pair, consistent with a $J/\psi$ from a detached vertex, and a kaon candidate with a $\chi^2_{\rm IP}$ above a threshold for all PVs in the event. These selection criteria are completed by a loose requirement on the response of a multivariate classifier, described in Ref. [29] and unchanged since then, applied to candidates in both signal and normalisation channels. The classifier takes as input quantities related to the direction of the candidate, its impact parameter with respect to the $B^0_{(s)}$ PV, the separation between the final-state tracks, and their impact parameters with respect to any PV.
After the trigger and selection requirements 78 241 signal candidates are found, which form the dataset for the subsequent branching fraction measurement.
The separation between signal and combinatorial background is achieved by means of the BDT variable, where the boosted decision tree is optimised using simulated samples of $B^0_s\to\mu^+\mu^-$ events for signal and of $b\bar{b}\to\mu^+\mu^-X$ events for background. The classifier combines information from the following input variables: $\sqrt{\Delta\phi^2+\Delta\eta^2}$, where $\Delta\phi$ and $\Delta\eta$ are the azimuthal angle and pseudorapidity differences between the two muon candidates; the minimum $\chi^2_{\rm IP}$ of the two muons with respect to the $B^0_{(s)}$ PV; the angle between the $B^0_{(s)}$ candidate momentum and the vector joining the $B^0_{(s)}$ decay vertex and $B^0_{(s)}$ PV; the $B^0_{(s)}$ candidate vertex-fit $\chi^2$ and impact parameter significance with respect to the $B^0_{(s)}$ PV. In addition, two isolation variables are included, to quantify the compatibility of the other tracks in the event with originating from the same $b$-hadron decay as the signal muon candidates. Most of the combinatorial background is composed of muons originating from semileptonic $b$-hadron decays, in which other charged particles may be produced and reconstructed. The isolation variables are constructed to recognise these particles and differ in the type of tracks being considered: the first considers tracks that have been reconstructed both before and after the magnet, while the second considers tracks reconstructed only in the vertex detector. The isolation variables are determined based on the proximity of the two muon candidates to the tracks of the event and are optimised using simulated $B^0_s\to\mu^+\mu^-$ and $b\bar{b}\to\mu^+\mu^-X$ events. The proximity of each muon candidate to a track is measured using a multivariate classifier that takes as input quantities such as the angular and spatial separation between the muon candidate and the track, the signed distance between the muon-track vertex and the $B^0_{(s)}$ candidate or primary vertex, and the kinematic and impact parameter information of the track.
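The first input variable listed above is an angular separation built from azimuthal and pseudorapidity differences. A minimal sketch of such a quantity is given below; the function name and the convention of wrapping the azimuthal difference into $[-\pi, \pi)$ are assumptions of this example, not details stated in the text.

```python
import math


def delta_r(phi1, eta1, phi2, eta2):
    """Angular separation sqrt(dphi^2 + deta^2) between two tracks.

    The azimuthal difference is wrapped into [-pi, pi) so that tracks on
    either side of the phi = +-pi boundary are treated as close together.
    """
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    deta = eta1 - eta2
    return math.hypot(dphi, deta)


# Two hypothetical muon candidates: (phi [rad], eta)
print(delta_r(0.3, 2.8, -0.2, 3.4))
```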
The BDT variable is constructed to be distributed uniformly in the range [0,1] for signal, and to peak strongly at zero for background. Its correlation with the dimuon mass is below 5%. Compared to the multivariate classifier used in the previous measurement [9], the combinatorial background above a given BDT threshold is reduced by approximately 50%, mainly due to the improved performance of the isolation variables.
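One common way to obtain a classifier output that is uniform in [0,1] for signal is to map the raw score through the empirical cumulative distribution of signal scores. The sketch below illustrates that idea on toy data; it is an assumption about how such a flattening could be done, not a description of the procedure used in this analysis, and all names are hypothetical.

```python
import numpy as np


def make_flattener(signal_scores):
    """Return a function mapping raw scores to the empirical signal CDF."""
    sorted_scores = np.sort(np.asarray(signal_scores, dtype=float))
    n = sorted_scores.size

    def flatten(scores):
        # Fraction of reference signal scores that are <= each input score.
        ranks = np.searchsorted(sorted_scores, np.asarray(scores, dtype=float), side="right")
        return ranks / n

    return flatten


rng = np.random.default_rng(seed=1)
raw_signal = rng.beta(5.0, 2.0, size=10_000)      # toy signal scores, peaking high
raw_background = rng.beta(1.0, 8.0, size=10_000)  # toy background scores, peaking low

flatten = make_flattener(raw_signal)
flat_signal = flatten(raw_signal)
flat_background = flatten(raw_background)

# Signal populates the five equal-width bins evenly; background piles up near zero.
print(np.histogram(flat_signal, bins=5, range=(0.0, 1.0))[0])
print(np.histogram(flat_background, bins=5, range=(0.0, 1.0))[0])
```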
The expected $B^0_{(s)}\to\mu^+\mu^-$ BDT distributions are determined from those of $B^0\to K^+\pi^-$ decays in data after correcting them for distortions due to trigger and muon identification. An additional correction is made for the $B^0_s\to\mu^+\mu^-$ signal, assuming the SM prediction, to account for the difference between the $B^0$ and $B^0_s\to\mu^+\mu^-$ lifetimes, which affects the BDT distribution. The mass distribution of the signal decays is described by a Crystal Ball function [30]. The peak values for the $B^0_s$ and $B^0$ mesons are obtained from the mass distributions of $B^0_s\to K^+K^-$ and $B^0\to K^+\pi^-$ samples, respectively. The mass resolutions as a function of mass are determined with a power-law interpolation between the measured resolutions of charmonium and bottomonium resonances decaying into two muons. The Crystal Ball radiative tail is obtained from simulated events [26], which are smeared such that they reproduce the 23 MeV/$c^2$ mass resolution measured in data.
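The power-law interpolation of the mass resolution can be illustrated with the simplest possible case, a curve $\sigma(m) = a\,m^{b}$ passing through two calibration points. The sketch below is a simplification (the analysis interpolates between several resonances), and the resolution values assigned to the two points are hypothetical placeholders rather than measured numbers.

```python
import math


def power_law_through(m1, s1, m2, s2):
    """Return (a, b) such that sigma(m) = a * m**b passes through both points."""
    b = math.log(s2 / s1) / math.log(m2 / m1)
    a = s1 / m1**b
    return a, b


def resolution(m, a, b):
    """Evaluate the power-law resolution model at mass m."""
    return a * m**b


# Approximate resonance masses in MeV/c^2 with HYPOTHETICAL resolutions.
m_jpsi, sigma_jpsi = 3096.9, 14.0   # charmonium point (resolution assumed)
m_ups, sigma_ups = 9460.3, 45.0     # bottomonium point (resolution assumed)

a, b = power_law_through(m_jpsi, sigma_jpsi, m_ups, sigma_ups)
print(f"sigma(m) = {a:.3e} * m^{b:.3f}")
print(f"interpolated resolution at 5367 MeV/c^2: {resolution(5367.0, a, b):.1f} MeV/c^2")
```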
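The signal branching fractions are measured with

$$\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-) = \frac{\mathcal{B}_{\rm norm}\,\epsilon_{\rm norm}\,f_{\rm norm}}{N_{\rm norm}\,\epsilon_{\rm sig}\,f_{d(s)}}\times N_{B^0_{(s)}\to\mu^+\mu^-} \equiv \alpha_{\rm norm}\times N_{B^0_{(s)}\to\mu^+\mu^-},$$

where $N_{B^0_{(s)}\to\mu^+\mu^-}$ is the number of observed signal decays, $N_{\rm norm}$ is the number of normalisation-channel decays ($B^+\to J/\psi K^+$ and $B^0\to K^+\pi^-$), $\mathcal{B}_{\rm norm}$ is the corresponding branching fraction [15], and $\epsilon_{\rm sig}$ ($\epsilon_{\rm norm}$) is the total efficiency for the signal (normalisation) channel. The fraction $f_{d(s)}$ indicates the probability for a $b$ quark to fragment into a $B^0_{(s)}$ meson. Assuming $f_d = f_u$, the fragmentation probability for the $B^0$ and $B^+$ normalisation channel is set to $f_d$. The value of $f_s/f_d$ in $pp$ collision data at $\sqrt{s} = 7$ TeV has been measured by LHCb to be $0.259 \pm 0.015$ [31]. The stability of $f_s/f_d$ at $\sqrt{s} = 8$ TeV and 13 TeV is evaluated by comparing the observed variation of the ratio of the efficiency-corrected yields of $B^0_s\to J/\psi\phi$ and $B^+\to J/\psi K^+$ decays. The effect of increased collision energy is found to be negligible for data at $\sqrt{s} = 8$ TeV while a scaling factor is applied for data at $\sqrt{s} = 13$ TeV.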
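The normalisation procedure described above amounts to multiplying the signal yield by a single-event sensitivity built from the normalisation-channel inputs. The sketch below shows the arithmetic, assuming the standard form in which the branching fraction equals $\mathcal{B}_{\rm norm}\,\epsilon_{\rm norm}\,f_{\rm norm}/(N_{\rm norm}\,\epsilon_{\rm sig}\,f_{\rm sig})$ times the signal yield. All numerical inputs and function names are hypothetical and chosen only to make the example run.

```python
def single_event_sensitivity(b_norm, n_norm, eff_norm, eff_sig, f_norm_over_f_sig):
    """Normalisation factor alpha, such that BF(signal) = alpha * N_signal."""
    return b_norm * eff_norm * f_norm_over_f_sig / (n_norm * eff_sig)


def branching_fraction(n_sig, alpha):
    """Convert an observed signal yield into a branching fraction."""
    return alpha * n_sig


# HYPOTHETICAL inputs, for illustration only.
alpha = single_event_sensitivity(
    b_norm=6.0e-5,                  # branching fraction of the normalisation mode
    n_norm=2.0e6,                   # observed normalisation yield
    eff_norm=0.010,                 # total efficiency, normalisation channel
    eff_sig=0.030,                  # total efficiency, signal channel
    f_norm_over_f_sig=1.0 / 0.25,   # f_d / f_s for a B0_s signal
)
print(f"alpha = {alpha:.2e}")
print(f"BF for 50 signal decays = {branching_fraction(50, alpha):.2e}")
```

Because the efficiencies enter only as a ratio, uncertainties common to signal and normalisation channels largely cancel, which is the main motivation for normalising to well-measured decays.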
The efficiency includes the detector acceptance, trigger, reconstruction and selection efficiencies of the final-state particles. The acceptance, reconstruction and selection efficiencies are computed with samples of simulated events whose decay-time distributions are generated according to the SM prediction. The tracking and particle identification efficiencies are determined using control channels in data [32, 33]. The trigger efficiencies are evaluated with data-driven techniques [34].
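The numbers of $B^+\to J/\psi K^+$ and $B^0\to K^+\pi^-$ decays are measured in data. The normalisation factors derived from the two channels are consistent and, taking correlations into account, are combined into weighted averages for the $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ modes. In the SM scenario, these averages determine the average numbers of $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ decays expected in the analysed data sample in the full BDT range.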
The combinatorial background is distributed almost uniformly over the mass range. In addition, the signal region and the low-mass sideband are populated by backgrounds from exclusive $b$-hadron decays, which can be classified in two categories. The first includes $B^0_{(s)}\to h^+h'^-$, $B^0\to\pi^-\mu^+\nu_\mu$, $B^0_s\to K^-\mu^+\nu_\mu$, and $\Lambda^0_b\to p\mu^-\bar{\nu}_\mu$ decays, where one or two hadrons are misidentified as a muon. The $B^0_{(s)}\to h^+h'^-$, $B^0\to\pi^-\mu^+\nu_\mu$ and $\Lambda^0_b\to p\mu^-\bar{\nu}_\mu$ branching fractions are taken from Refs. [15, 35], while a theoretical estimate for $B^0_s\to K^-\mu^+\nu_\mu$ is obtained from Refs. [36, 37]. The mass and BDT distributions of these decays are determined from simulated samples after calibrating the $K$, $\pi$ and $p$ momentum-dependent misidentification probabilities using control channels in data. An independent estimate of the $B^0_{(s)}\to h^+h'^-$, $B^0\to\pi^-\mu^+\nu_\mu$ and $B^0_s\to K^-\mu^+\nu_\mu$ background yields is obtained by fitting the mass spectrum of $\pi^+\mu^-$ or $K^+\mu^-$ combinations selected in data, and rescaling the yields according to the $\pi$ or $K$ misidentification probability. The difference with respect to the results from the first method is assigned as a systematic uncertainty. The second category includes the decays $B^+_c\to J/\psi\mu^+\nu_\mu$, with $J/\psi\to\mu^+\mu^-$, and $B^{0(+)}\to\pi^{0(+)}\mu^+\mu^-$, which have at least two muons in the final state. The rate of $B^+_c\to J/\psi\mu^+\nu_\mu$ decays is evaluated from Refs. [38, 39], while those of $B^{0(+)}\to\pi^{0(+)}\mu^+\mu^-$ decays are obtained from Refs. [40, 41]. The expected yields of all exclusive backgrounds are estimated using the decay $B^+\to J/\psi K^+$ as the normalisation channel, with the exception of the $B^0_{(s)}\to h^+h'^-$ decays, which are normalised to the mode $B^0\to K^+\pi^-$. The contributions from $B^0_s\to\mu^+\mu^-\gamma$ and $B^0_s\to\mu^+\mu^-\nu_\mu\bar{\nu}_\mu$ decays [42, 4, 43] have a negligible impact on the signal yield determination. The expected background yields in the signal region are evaluated for each of these modes, one of which is found to be negligible. Except for the misidentified $B^0_{(s)}\to h^+h'^-$ decays, which populate the signal region, the other modes are mostly concentrated in the low-mass sideband.
The Run 1 and Run 2 datasets are each divided into five subsets based on bins in the BDT variable with boundaries 0.0, 0.25, 0.4, 0.5, 0.6 and 1.0. The $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ branching fractions are determined with a simultaneous unbinned maximum likelihood fit to the dimuon mass distribution in each BDT bin of the two datasets. The $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ fractional yields in each BDT bin and the parameters of the Crystal Ball functions that describe the shapes of the mass distributions are Gaussian-constrained according to their expected values and uncertainties. The combinatorial background in each BDT bin is parameterised with an exponential function, with a common slope parameter for all bins of a given dataset, while the yield is allowed to vary independently. The exclusive backgrounds are included as separate components in the fit. Their overall yields as well as the fractions in each BDT bin are Gaussian-constrained according to their expected values. Their mass shapes are determined from a simulation for each BDT bin.
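The division of each dataset into BDT bins before the simultaneous fit can be sketched as follows. The bin edges in `BDT_EDGES`, the toy candidate values and the function name are assumptions of this example; only the existence of five bins per dataset is taken from the text.

```python
import numpy as np

# Assumed BDT bin edges defining five bins on [0, 1].
BDT_EDGES = np.array([0.0, 0.25, 0.4, 0.5, 0.6, 1.0])


def bdt_bin(bdt_values):
    """Return the bin index (0..4) for each BDT value in [0, 1].

    Only the inner edges are passed to digitize, so a value of exactly 1.0
    falls in the last bin rather than overflowing.
    """
    return np.digitize(np.asarray(bdt_values, dtype=float), BDT_EDGES[1:-1])


# Toy candidates: dimuon mass in MeV/c^2 and BDT output.
mass = np.array([5012.0, 5371.5, 5280.2, 5890.0, 5365.0])
bdt = np.array([0.10, 0.72, 0.45, 0.25, 1.00])

bins = bdt_bin(bdt)
for i in range(len(BDT_EDGES) - 1):
    # Each subset would enter the simultaneous fit as its own mass distribution.
    print(f"bin {i} [{BDT_EDGES[i]}, {BDT_EDGES[i + 1]}]: masses = {mass[bins == i]}")
```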
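The values of the $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ branching fractions obtained from the fit are $\mathcal{B}(B^0_s\to\mu^+\mu^-) = (3.0 \pm 0.6^{+0.3}_{-0.2})\times 10^{-9}$ and $\mathcal{B}(B^0\to\mu^+\mu^-) = (1.5^{+1.2\,+0.2}_{-1.0\,-0.1})\times 10^{-10}$. The statistical uncertainty is derived by repeating the fit after fixing all the fit parameters, except the $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ branching fractions, the background yields and the slope of the combinatorial background, to their expected values. The systematic uncertainties of $\mathcal{B}(B^0_s\to\mu^+\mu^-)$ and $\mathcal{B}(B^0\to\mu^+\mu^-)$ are dominated by the uncertainty on $f_s/f_d$ and the knowledge of the exclusive backgrounds, respectively. The correlation between the two branching fractions is negligible. The mass distribution of the $B^0_{(s)}\to\mu^+\mu^-$ candidates with BDT $> 0.5$ is shown in Fig. 1, together with the fit result [44].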
An excess of $B^0_s\to\mu^+\mu^-$ candidates with respect to the expectation from background is observed with a significance of 7.8 standard deviations ($7.8\,\sigma$), while the significance of the $B^0\to\mu^+\mu^-$ signal is $1.6\,\sigma$. The significances are determined, using Wilks' theorem [45], from the difference in likelihood between fits with and without the signal component.
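For a single parameter of interest, Wilks' theorem implies that twice the difference in negative log-likelihood between the background-only and signal-plus-background fits is asymptotically $\chi^2$-distributed with one degree of freedom, so the significance is its square root. A minimal sketch follows, with hypothetical log-likelihood values and function names:

```python
import math


def significance_from_nll(nll_background_only, nll_signal_plus_background):
    """Significance in standard deviations from two minimised NLL values,
    assuming one parameter of interest and the asymptotic (Wilks) regime."""
    q = 2.0 * (nll_background_only - nll_signal_plus_background)
    return math.sqrt(max(q, 0.0))


def one_sided_p_value(z):
    """One-sided Gaussian tail probability corresponding to z sigma."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# HYPOTHETICAL minimised negative log-likelihood values.
z = significance_from_nll(nll_background_only=112.5, nll_signal_plus_background=100.0)
print(f"significance = {z:.1f} sigma, p-value = {one_sided_p_value(z):.2e}")
```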
Since no significant $B^0\to\mu^+\mu^-$ signal is observed, an upper limit on the branching fraction is set using the CL$_{\rm s}$ method [46]. The ratio between the likelihoods in two hypotheses, signal plus background and background only, is used as the test statistic. The likelihoods are computed with nuisance parameters fixed to their nominal values. Pseudo-experiments are used for the evaluation of the test statistic in which the nuisance parameters are floated according to their uncertainties. The resulting upper limit on $\mathcal{B}(B^0\to\mu^+\mu^-)$ is $3.4\times 10^{-10}$ at 95% confidence level.
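The logic of this kind of limit can be illustrated with a much simpler setting than the one used here: a single-bin counting experiment with known background, where the observed count itself serves as the test statistic and no nuisance parameters or pseudo-experiments are needed. The sketch below is that simplified stand-in, not the likelihood-ratio procedure of the analysis; all names and inputs are hypothetical.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls(n_obs, signal, background):
    """Ratio CL_{s+b} / CL_b for a counting experiment."""
    return poisson_cdf(n_obs, signal + background) / poisson_cdf(n_obs, background)


def upper_limit(n_obs, background, cl=0.95, tol=1e-6):
    """Signal yield excluded at the requested confidence level (bisection)."""
    target = 1.0 - cl
    lo, hi = 0.0, 1.0
    while cls(n_obs, hi, background) > target:
        hi *= 2.0          # expand the bracket until the upper end is excluded
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, background) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# HYPOTHETICAL example: 3 events observed on an expected background of 2.5.
print(f"95% CL upper limit on the signal yield: {upper_limit(3, 2.5):.2f}")
```

Dividing by the background-only probability protects against excluding signals to which the experiment has little sensitivity when the data fluctuate below the expected background.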
The selection efficiency and BDT distribution of $B^0_s\to\mu^+\mu^-$ decays depend on the lifetime, which in turn depends on the model assumption entering Eq. 1. This introduces a further model-dependence in the measured time-integrated branching fraction. In the fit, the SM value of the effective lifetime is assumed, corresponding to $A_{\Delta\Gamma} = 1$. The model dependence is evaluated by repeating the fit under the $A_{\Delta\Gamma} = 0$ and $A_{\Delta\Gamma} = -1$ hypotheses, finding an increase of the branching fraction with respect to the SM assumption of 4.6% and 10.9%, respectively. The dependence is approximately linear in the physically allowed range.
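Since the dependence is approximately linear, the shift of the branching fraction for an intermediate model can be estimated by interpolation between the quoted points. The sketch below assumes that the 4.6% and 10.9% increases correspond to $A_{\Delta\Gamma} = 0$ and $A_{\Delta\Gamma} = -1$, with no shift at the SM value $A_{\Delta\Gamma} = 1$, and uses piecewise-linear interpolation as a simple stand-in for the true dependence; the function name is hypothetical.

```python
import numpy as np

# Relative increase of the branching fraction with respect to the SM assumption.
A_POINTS = np.array([-1.0, 0.0, 1.0])     # assumed A_DeltaGamma hypotheses
REL_SHIFT = np.array([0.109, 0.046, 0.0])


def corrected_branching_fraction(bf_sm_assumption, a_dg):
    """Rescale a branching fraction obtained under A_DeltaGamma = 1
    to another value of A_DeltaGamma in [-1, 1]."""
    if not -1.0 <= a_dg <= 1.0:
        raise ValueError("A_DeltaGamma must lie in [-1, 1]")
    shift = np.interp(a_dg, A_POINTS, REL_SHIFT)
    return bf_sm_assumption * (1.0 + shift)


for a in (1.0, 0.5, 0.0, -1.0):
    print(f"A_DeltaGamma = {a:+.1f}: BF = {corrected_branching_fraction(3.0e-9, a):.3e}")
```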
For the lifetime determination, the data are background-subtracted with the sPlot technique [47], using a fit to the dimuon mass distribution to disentangle signal and background components statistically. Subsequently, a fit to the signal decay-time distribution is made with an exponential function multiplied by the acceptance function of the detector. The $B^0_s\to\mu^+\mu^-$ candidates are selected using criteria similar to those applied in the branching fraction analysis, the main differences being a reduced dimuon mass window, $[5320, 6000]$ MeV/$c^2$, and looser particle identification requirements on the muon candidates. The former change allows the fit model for the signal to be simplified by removing most of the $B^0\to\mu^+\mu^-$ and exclusive background decays that populate the lower dimuon mass region, while the latter increases the signal selection efficiency. Furthermore, instead of performing a fit in bins of BDT, a requirement of BDT $> 0.55$ is imposed. All these changes minimise the statistical uncertainty on the measured effective lifetime. This selection results in a final sample of 42 candidates.
The mass fit includes the $B^0_s\to\mu^+\mu^-$ and combinatorial background components. The parameterisations of the mass shapes are the same as used in the branching fraction analysis. The correlation between the mass and the reconstructed decay time of the selected candidates is less than 3%.
The variation of the trigger and selection efficiency with decay time is corrected for in the fit by introducing an acceptance function, determined from simulated signal events that are weighted to match the properties of the events seen in data. The use of simulated events to determine the decay-time acceptance function is validated by measuring the effective lifetime of $B^0\to K^+\pi^-$ decays selected in data. The measured effective lifetime is $1.52 \pm 0.03$ ps, where the uncertainty is statistical only, consistent with the world average [15]. The statistical uncertainty on the measured $B^0\to K^+\pi^-$ lifetime is taken as the systematic uncertainty associated with the use of simulated events to determine the $B^0_s\to\mu^+\mu^-$ acceptance function.
The accuracy of the fit for the $B^0_s\to\mu^+\mu^-$ effective lifetime is estimated using a large number of simulated experiments with signal and background contributions equal, on average, to those observed in the data. The contamination from $B^0\to\mu^+\mu^-$, $B^0_{(s)}\to h^+h'^-$ and semileptonic $b$-hadron decays above 5320 MeV/$c^2$ is small and not included in the fit. The effect on the effective lifetime from the unequal production rate of $B^0_s$ and $\bar{B}^0_s$ mesons [48] is negligible. A bias may also arise if $A_{\Delta\Gamma} \neq 1$, with the consequence that the underlying decay-time distribution is the sum of two exponential distributions with the lifetimes of the light and heavy mass eigenstates. In this case, as the selection efficiency varies with the decay time, the returned value of the lifetime from the fit is not exactly equal to the definition of the effective lifetime even if the decay-time acceptance function is correctly accounted for. This effect has been evaluated for the scenario where there are equal contributions from both eigenstates to the decay. The result can also be biased if the background has a much longer mean lifetime than $B^0_s\to\mu^+\mu^-$ decays; this is mitigated by an upper decay-time cut of 13.5 ps. Any remaining bias is evaluated using the background decay-time distribution of the much larger $B^0\to K^+\pi^-$ data sample. All of these effects are found to be small compared to the statistical uncertainty and combine to give 0.05 ps, with the main contributions arising from the fit accuracy and the decay-time acceptance (0.03 ps each).
The mass distribution of the selected $B^0_s\to\mu^+\mu^-$ candidates is shown in Fig. 2 (top). Figure 2 (bottom) shows the background-subtracted decay-time distribution with the fit function superimposed [44]. The fit results in $\tau(B^0_s\to\mu^+\mu^-) = 2.04 \pm 0.44 \pm 0.05$ ps, where the first uncertainty is statistical and the second systematic. This measurement is consistent with the $A_{\Delta\Gamma} = 1$ ($-1$) hypothesis at the $1.0\,\sigma$ ($1.4\,\sigma$) level. Although the current experimental uncertainty allows only a weak constraint to be set on the value of the $A_{\Delta\Gamma}$ parameter in the physically allowed region, this result establishes the potential of the effective lifetime measurement in constraining New Physics scenarios with the datasets that LHCb is expected to collect in the coming years [49].
In summary, a search for the rare decays $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ is performed in $pp$ collision data corresponding to a total integrated luminosity of 4.4 fb$^{-1}$. The $B^0_s\to\mu^+\mu^-$ signal is seen with a significance of 7.8 standard deviations and provides the first observation of this decay from a single experiment. The time-integrated branching fraction is measured to be $\mathcal{B}(B^0_s\to\mu^+\mu^-) = (3.0 \pm 0.6^{+0.3}_{-0.2})\times 10^{-9}$, under the $A_{\Delta\Gamma} = 1$ hypothesis. This is the most precise measurement of this quantity to date. In addition, the first measurement of the $B^0_s\to\mu^+\mu^-$ effective lifetime, $\tau(B^0_s\to\mu^+\mu^-) = 2.04 \pm 0.44 \pm 0.05$ ps, is presented. No evidence for a $B^0\to\mu^+\mu^-$ signal is found, and the upper limit $\mathcal{B}(B^0\to\mu^+\mu^-) < 3.4\times 10^{-10}$ at 95% confidence level is set. The results are in agreement with the SM predictions and tighten the existing constraints on possible New Physics contributions to these decays.
Acknowledgements
We express our gratitude to our colleagues in the CERN accelerator departments for the excellent performance of the LHC. We thank the technical and administrative staff at the LHCb institutes. We acknowledge support from CERN and from the national agencies: CAPES, CNPq, FAPERJ and FINEP (Brazil); MOST and NSFC (China); CNRS/IN2P3 (France); BMBF, DFG and MPG (Germany); INFN (Italy); NWO (The Netherlands); MNiSW and NCN (Poland); MEN/IFA (Romania); MinES and FASO (Russia); MinECo (Spain); SNSF and SER (Switzerland); NASU (Ukraine); STFC (United Kingdom); NSF (USA). We acknowledge the computing resources that are provided by CERN, IN2P3 (France), KIT and DESY (Germany), INFN (Italy), SURF (The Netherlands), PIC (Spain), GridPP (United Kingdom), RRCKI and Yandex LLC (Russia), CSCS (Switzerland), IFIN-HH (Romania), CBPF (Brazil), PL-GRID (Poland) and OSC (USA). We are indebted to the communities behind the multiple open source software packages on which we depend. Individual groups or members have received support from AvH Foundation (Germany), EPLANET, Marie Skłodowska-Curie Actions and ERC (European Union), Conseil Général de Haute-Savoie, Labex ENIGMASS and OCEVU, Région Auvergne (France), RFBR and Yandex LLC (Russia), GVA, XuntaGal and GENCAT (Spain), Herchel Smith Fund, The Royal Society, Royal Commission for the Exhibition of 1851 and the Leverhulme Trust (United Kingdom).
- [1] RBC-UKQCD collaborations, O. Witzel, $B$-meson decay constants with domain-wall light quarks and nonperturbatively tuned relativistic $b$-quarks, arXiv:1311.0276
- [2] HPQCD collaboration, H. Na et al., $B$ and $B_s$ meson decay constants from lattice QCD, Phys. Rev. D 86 (2012) 034506, arXiv:1202.4914
- [3] Fermilab Lattice and MILC collaborations, A. Bazavov et al., B- and D-meson decay constants from three-flavor lattice QCD, Phys. Rev. D 85 (2012) 114506, arXiv:1112.3051
- [4] C. Bobeth et al., $B_{s,d}\to l^+l^-$ in the Standard Model with reduced theoretical uncertainty, Phys. Rev. Lett. 112 (2014) 101801, arXiv:1311.0903
- [5] K. S. Babu and C. F. Kolda, Higgs mediated $B^0\to\mu^+\mu^-$ in minimal supersymmetry, Phys. Rev. Lett. 84 (2000) 228, arXiv:hep-ph/9909476
- [6] G. Isidori and A. Retico, Scalar flavor changing neutral currents in the large tan beta limit, JHEP 11 (2001) 001, arXiv:hep-ph/0110121
- [7] A. J. Buras, P. H. Chankowski, J. Rosiek, and L. Slawianowska, Correlation between $\Delta M_s$ and $B^0_{s,d}\to\mu^+\mu^-$ in supersymmetry at large $\tan\beta$, Phys. Lett. B 546 (2002) 96, arXiv:hep-ph/0207241
- [8] CMS and LHCb collaborations, V. Khachatryan et al., Observation of the rare $B^0_s\to\mu^+\mu^-$ decay from the combined analysis of CMS and LHCb data, Nature 522 (2015) 68, arXiv:1411.4413
