Study of $t\bar{t}H$ production with $H\rightarrow b\bar{b}$ at the HL-LHC
A. J. Costa, A. L. Carvalho, R. Gonçalo, P. Muiño, A. Onofre

TL;DR
This study proposes a jet substructure technique for detecting $t\bar{t}H$ production with $H\rightarrow b\bar{b}$ at the HL-LHC, demonstrating potential for observing the process and measuring the top Yukawa coupling with high significance.
Contribution
It introduces a novel cut-based analysis focusing on boosted Higgs reconstruction using jet substructure, improving sensitivity over existing strategies.
Findings
$t\bar{t}H$ can be observed with over 5 sigma significance at 300 fb$^{-1}$.
The top Yukawa coupling can be measured with 17-35% uncertainty.
Re-clustered jets can be used without loss of efficiency.
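Table 1: $b$-tagging working points assumed in each scenario.

| Scenario | $b$-tag efficiency | $c$-jet mistag probability | Light-jet mistag probability |
|---|---|---|---|
| LHC | 61% | 4.5% | 0.08% |
| HL-LHC | 65% | 3% | 0.07% |
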
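Table 2: Significance and $S/B$ in the HL-LHC scenario for different integrated luminosities.

| Integrated luminosity (fb$^{-1}$) | Significance ($\sigma$) | $S/B$ (%) |
|---|---|---|
| 36 | 2.12 ± 0.04 | 15.7 ± 0.4 |
| 300 | 6.13 ± 0.11 | 15.7 ± 0.4 |
| 3000 | 19.39 ± 0.33 | 15.7 ± 0.4 |
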
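Table 3: Significance and $S/B$ in the LHC and HL-LHC scenarios for different integrated luminosities.

| Scenario | Integrated luminosity (fb$^{-1}$) | Significance ($\sigma$) | $S/B$ (%) |
|---|---|---|---|
| LHC | 36 | 1.88 ± 0.04 | 11.6 ± 0.4 |
| HL-LHC | 36 | 2.12 ± 0.04 | 15.7 ± 0.4 |
| LHC | 300 | 5.41 ± 0.12 | 11.6 ± 0.4 |
| HL-LHC | 300 | 6.13 ± 0.11 | 15.7 ± 0.4 |
| LHC | 3000 | 17.12 ± 0.38 | 11.6 ± 0.4 |
| HL-LHC | 3000 | 19.39 ± 0.33 | 15.7 ± 0.4 |
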
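Table 4: Expected uncertainty on the top Yukawa coupling.

| Scenario | Integrated luminosity (fb$^{-1}$) | Top Yukawa coupling uncertainty (%) |
|---|---|---|
| LHC | 300 | 35 |
| HL-LHC | 3000 | 17 |
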
Table 5: Expected signal strength.

| Scenario | Integrated luminosity (fb$^{-1}$) | Signal strength ($\mu$) |
|---|---|---|
| LHC | 300 | 0.99 ± 0.18 |
| HL-LHC | 3000 | 1.00 ± 0.05 |

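Table 6: Significance and $S/B$ for different integrated luminosities.

| Strategy | Integrated luminosity (fb$^{-1}$) | Significance ($\sigma$) | $S/B$ (%) |
|---|---|---|---|
| | 36 | 2.12 ± 0.04 | 15.7 ± 0.4 |
| | 36 | 1.63 ± 0.03 | 12.1 ± 0.3 |
| | 300 | 6.13 ± 0.11 | 15.7 ± 0.4 |
| | 300 | 4.71 ± 0.10 | 12.1 ± 0.3 |
| | 3000 | 19.39 ± 0.33 | 15.7 ± 0.4 |
| | 3000 | 14.90 ± 0.32 | 12.1 ± 0.3 |
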
Study of $t\bar{t}H$ production with $H\rightarrow b\bar{b}$ at the HL-LHC
A. J. Costa
School of Physics and Astronomy, University of Birmingham, Edgbaston Park Rd, Birmingham B15 2TT, United Kingdom
A. L. Carvalho
LIP, Av. Prof. Gama Pinto, 2, 1649-003 Lisboa, Portugal
R. Gonçalo
LIP, Av. Prof. Gama Pinto, 2, 1649-003 Lisboa, Portugal
Faculdade de Ciências da Universidade de Lisboa, Campo Grande 016, 1749-016 Lisboa, Portugal
P. Muiño
LIP, Av. Prof. Gama Pinto, 2, 1649-003 Lisboa, Portugal
Departamento de Física, Instituto Superior Técnico – IST, Universidade de Lisboa – UL, Avenida Rovisco Pais 1, 1049 Lisboa, Portugal
A. Onofre
LIP, Departamento de Física, Universidade do Minho, 4710-057 Braga, Portugal
Abstract
A feasibility study for an experimental analysis searching for $t\bar{t}H$ ($H\rightarrow b\bar{b}$) production at the LHC and its high luminosity phase is presented in this note. Unlike the search strategies currently used by the experimental collaborations, the present analysis exploits jet substructure techniques and focuses on the reconstruction of boosted Higgs bosons, to obtain sensitivity to the signal in a simple cut-based analysis. The $t\bar{t}$+jets background may be constrained in the proposed analysis through a control region with very small signal contamination. Using this analysis strategy, the $t\bar{t}H$ process could be observed at the LHC, in the semi-leptonic channel alone, with a significance of $5.41 \pm 0.12\,\sigma$ for 300 fb$^{-1}$. For the same integrated luminosity, in the High Luminosity LHC scenario with an upgraded detector, a significance of $6.13 \pm 0.11\,\sigma$ may be obtained. The top Yukawa coupling could be measured with a 35% uncertainty using 300 fb$^{-1}$ of LHC data, and with a 17% uncertainty in the HL-LHC scenario with 3000 fb$^{-1}$. In the same luminosity scenarios, the signal strength is expected to have an 18% and a 5% uncertainty, respectively. Finally, it was found that re-clustered jets may be used without loss of efficiency.
I Introduction
An impressive amount of work has been devoted to measuring the Higgs boson properties since its discovery in 2012, by the ATLAS and CMS experiments Collaboration (2012a, b) at the LHC. The interaction of the Higgs boson with fermions is one of its most important features, as it is responsible for their masses. Of these, the most experimentally accessible Yukawa couplings are to third generation quarks and leptons.
Both ATLAS and CMS have recently observed the coupling of the Higgs boson to top et al. (ATLAS Collaboration, CMS Collaboration) and bottom et al. (ATLAS Collaboration, CMS Collaboration) quarks, and to tau leptons et al. (ATLAS Collaboration, CMS Collaboration). Of these, the top quark Yukawa coupling is of particular relevance, due to the large top quark mass, and might reveal a window into the physics beyond the Standard Model (SM).
The production of the Higgs boson in association with two top quarks, $t\bar{t}H$, is especially important, as it provides direct experimental access to the $t\bar{t}H$ vertex at Born level. However, this process contributes only around of the total Higgs boson production cross-section, due to the large invariant mass of the final state objects. Nevertheless, different Higgs decay modes are accessible in this process, and the decay of the Higgs boson into two bottom ($b$) quarks poses an interesting scenario, as this decay has the largest branching ratio of the Higgs particle () Grojean (2017), and contributes to a distinctive experimental signature.
In the present work, we have considered the final state where one top decays hadronically and the other semi-leptonically, and the Higgs boson decays to a $b$-quark pair ($H\rightarrow b\bar{b}$). Despite the distinct final state, this channel is dominated by large systematic uncertainties, related to a poor understanding of the production of $t\bar{t}$ pairs in association with heavy-flavour quarks (bottom or charm). This leads to a rather poor experimental sensitivity (see e.g. Collaboration (2018)).
The present paper proposes an alternative strategy for experimental searches at the High-Luminosity LHC (HL-LHC), based on a strategy proposed for the Future Circular Collider M. L. Mangano et al. (2016). It relies on the reconstruction of Higgs bosons with high transverse momentum ($p_T$), so that their decay products are confined in a large radius jet. Hadronic jet substructure information is used to further discriminate between signal and backgrounds.
The high luminosity phase of the LHC Collaboration (2015); G. Apollinari et al. (2017) is expected to start operation in 2026 and run for ten years, collecting up to of proton collisions at a center-of-mass energy of 14 TeV. The number of expected collisions per bunch crossing, or pileup, will be up to 200, much higher than at present.
II Simulation
In addition to the signal and its main irreducible background, this paper also considered other relevant background processes, namely , , where corresponds to additional jets, , and QCD di-jet production. An alternative signal sample was also generated with the HC_UFO_v4.1 model et al. (2013), where is a pure pseudo-scalar boson instead of the SM scalar Higgs.
Events for these processes were generated at Leading Order (LO), and for a center-of-mass energy of 14 TeV. The MG5_aMC@NLO generator et al. (2014) and the LO NN23LO1 PDF were used for all samples except for the di-jet sample, which was generated with Pythia8.2 T. Sjostrand and Skands (2008) using the LO CTEQ 5L PDF. MadSpin P. Artoisenet and Rietkerk (2013) was used in the generation of all MG5_aMC@NLO samples, to preserve spin information in particle decays.
At generator level, cuts were applied to enhance the generation efficiency, and their effect on the analysis outcome was verified to be negligible. Leptons and quarks were required to have a minimum transverse momentum of in all samples, with the exception of the di-jet sample (where a $b$-quark transverse momentum cut was applied, ) and the sample ( GeV).
Non-$b$-initiated jets were required to have transverse momenta GeV in the , and processes. On the other hand, jets were required to satisfy in the sample and in the generation of the sample. The minimum cut was set to for the di-jet sample. Furthermore, a minimum angular separation between pairs of jets and leptons was required, with .
All simulated events are hadronized using Pythia8.2, and Delphes3.2 de Favereau et al. (DELPHES 3 Collaboration) is used for the fast simulation of the collider experiments. The ATLAS default card was considered for simulating the LHC scenario, while the HL-LHC card was used for higher luminosity scenarios.
Finally, in Delphes, simulated leptons are required to have a minimum transverse momentum of , and an isolation variable below 0.1 within , meaning that the $p_T$ of a jet around the lepton must be less than 10% of the lepton $p_T$ for the lepton to be considered isolated.
III Tagging
The identification of $b$-quark initiated jets, or $b$-tagging, was emulated by searching for a $b$ quark within of each jet. In the LHC (HL-LHC) scenario, if a $b$ quark was found, the jet was considered $b$-tagged with a probability of 61% (65%). Otherwise, a $c$ quark was sought and, if found, a 4.5% (3%) probability was assigned for mis-tagging this jet as a $b$-jet. Finally, a 0.08% (0.07%) probability was assigned to mis-tagging jets initiated by light quarks or gluons. These working points were determined from existing literature for the HL-LHC et al. (ATLAS Collaboration) and LHC scenarios, and are summarised in Table 1.
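The tagging emulation described above can be sketched in a few lines. This is only an illustration, assuming that the truth-level flavour match has already been done; the names `WORKING_POINTS`, `jet_flavour` and `is_b_tagged` are hypothetical and do not come from the analysis code.

```python
import random

# Tagging probabilities quoted in the text, per scenario and truth flavour.
WORKING_POINTS = {
    "LHC": {"b": 0.61, "c": 0.045, "light": 0.0008},
    "HL-LHC": {"b": 0.65, "c": 0.03, "light": 0.0007},
}


def jet_flavour(has_b_quark, has_c_quark):
    """Assign a truth flavour: a matched b quark takes precedence over a c quark."""
    if has_b_quark:
        return "b"
    if has_c_quark:
        return "c"
    return "light"


def is_b_tagged(flavour, scenario, rng):
    """Tag the jet with the probability of the chosen working point."""
    return rng.random() < WORKING_POINTS[scenario][flavour]


# Usage: the tagged fraction of true b-jets should approach the 65% working point.
rng = random.Random(1234)
n_jets = 100_000
n_tagged = sum(is_b_tagged(jet_flavour(True, False), "HL-LHC", rng) for _ in range(n_jets))
b_tag_fraction = n_tagged / n_jets
```

Drawing one random number per jet keeps the emulation independent of the detector simulation, which is why the working points can be swapped between scenarios without regenerating events.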
$b$-tagged jets are required to have . Improvements in the $b$-tagging in the HL-LHC scenario are only expected within this range, with performance improvements beyond this range still under optimization and uncertain.
IV Event Selection
The analysis proposed in the present study corresponds to adapting and optimizing for the HL-LHC the strategy proposed in Ref. M. L. Mangano et al. (2016). The main differences are explained below.
Selected events are required to have an isolated charged lepton ($e$ or $\mu$), with and . To avoid the need for unfeasibly large samples, the isolated charged lepton is not required for the and di-jet backgrounds. Instead, one in jets is identified as a lepton, to emulate fake lepton identification.
The calorimeter towers in the simulated event are then collected and the ones within a of an isolated electron are removed to avoid double-counting energy deposits. Muon energy deposits in the calorimeter are considered negligible. The remaining towers form the ’tower collection’ and are used as input to jet clustering, done with Fastjet. The Cambridge-Aachen (C/A) Y. L. Dokshitzer and Webber (1997) algorithm is used to reconstruct jets with a radius and . One or more of these jets are required.
The C/A jets are used to search for Higgs boson candidates using the BDRS tagger J. M. Butterworth and Salam (2008). This algorithm attempts to identify jets containing two sub-jets and a significant invariant mass. The algorithm parameters were a mass drop condition of 0.9 and . If the algorithm identifies a Higgs candidate jet, its two subjets are required to be $b$-tagged, and have .
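As an illustration of the mass-drop step, the sketch below checks the two standard BDRS conditions on a single declustering of a jet into two subjets. It is a simplified stand-in, not the tagger used in the analysis: the function name and arguments are hypothetical, the mass-drop threshold of 0.9 is the one quoted above, and the asymmetry cut `y_cut = 0.09` is an assumption taken from the original BDRS proposal, since the second parameter is not recoverable from the text.

```python
def passes_mass_drop(m_jet, m_heavier_subjet, pt_sub1, pt_sub2, delta_r,
                     mu=0.9, y_cut=0.09):
    """Check the two BDRS conditions for one declustering step.

    m_jet            -- invariant mass of the parent jet
    m_heavier_subjet -- mass of the heavier of the two subjets
    pt_sub1, pt_sub2 -- transverse momenta of the two subjets
    delta_r          -- angular separation between the two subjets
    mu               -- mass-drop threshold (0.9 as quoted in the text)
    y_cut            -- asymmetry threshold (assumed value, see lead-in)
    """
    # Condition 1: significant mass drop from parent to heavier subjet.
    mass_drop = m_heavier_subjet < mu * m_jet
    # Condition 2: the splitting is not too asymmetric.
    y = min(pt_sub1, pt_sub2) ** 2 * delta_r ** 2 / m_jet ** 2
    return mass_drop and y > y_cut


# Usage with illustrative numbers (GeV): a symmetric two-prong jet passes.
symmetric = passes_mass_drop(125.0, 40.0, 150.0, 100.0, 0.8)
# A jet whose mass sits almost entirely in one subjet fails the mass drop.
no_drop = passes_mass_drop(125.0, 120.0, 150.0, 100.0, 0.8)
# A very asymmetric splitting fails the y cut.
asymmetric = passes_mass_drop(125.0, 40.0, 150.0, 10.0, 0.8)
```

In the full algorithm, a jet that fails either condition is replaced by its heavier subjet and the check is repeated; only the single-step test is shown here.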
The candidates that pass these selection criteria are then filtered J. M. Butterworth and Salam (2008) to remove eventual pile-up and underlying event contamination. Up to three hard thinner subjets are kept, to account for gluon radiation of one of the quarks. After this procedure candidate jets are required to have .
Higgs candidate jets with a $\Delta R$ between the two BDRS $b$-tagged sub-jets () below 0.3 are rejected, in order to suppress wrongly identified Higgs candidates. As a side-effect, the low-$\Delta R$ events provide a useful side-band at low jet masses.
In events with more than one Higgs jet candidate (around of events) the jet with highest is chosen. The event is then required to have one Higgs candidate, and its associated towers are removed from the tower collection to avoid energy double counting in subsequent steps.
The remaining towers are clustered into anti-$k_t$ jets, which are required to have . Two $b$-tagged jets are then required, with between them.
The $\Delta R$ between the leading and sub-leading $b$-tagged jets and the Higgs candidate jet are computed and referred to as and , respectively. Events are then required to satisfy , and , to suppress backgrounds.
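The angular requirements in this selection can be evaluated with a small helper, assuming they use the usual separation $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$. The sketch below is illustrative; `delta_phi` and `delta_r` are hypothetical helper names, and the only threshold taken from the text is the 0.3 cut on the Higgs candidate sub-jets.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# Usage: two sub-jets on either side of the phi = +/-pi boundary are close,
# so this candidate would be rejected by the 0.3 cut quoted in the text.
separation = delta_r(0.0, 3.0, 0.0, -3.0)
rejected = separation < 0.3
```

Wrapping the azimuthal difference matters for objects near $\phi = \pm\pi$, where a naive subtraction would overestimate the separation.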
The main changes with respect to the original analysis M. L. Mangano et al. (2016) are twofold. First, no use is made of the HEPTopTagger2 T. Plehn and Spannowsky (2010) algorithm to tag hadronically decaying top quarks, since this was found to suppress the signal efficiency in the kinematic regime of the HL-LHC. Second, the C/A jet radius and jet cuts were changed. Comparing both strategies when applied to HL-LHC simulated events, the proposed analysis corresponds to a factor 3 improvement in the analysis significance.
The significance and signal-to-background ratio were determined for Higgs candidate jets with mass, , between 60 and 160 GeV. Moreover, the significance is computed using . The mass distribution of the Higgs candidate jets, for the HL-LHC scenario and SM samples, is shown in Figure 1 for an integrated luminosity of 3000 fb$^{-1}$.
The significance and $S/B$, estimated in the mass window between 60 and 160 GeV, are shown in Table 2 for different integrated luminosities.
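The tabulated HL-LHC significances (2.12, 6.13 and 19.39 for 36, 300 and 3000 fb$^{-1}$) appear consistent with a simple $\sqrt{L}$ scaling at fixed $S/B$. The following quick check assumes that signal and background yields both scale linearly with luminosity; the function name is hypothetical.

```python
import math


def scale_significance(z_ref, lumi_ref, lumi):
    """Scale a significance from lumi_ref to lumi, assuming Z grows as sqrt(L)."""
    return z_ref * math.sqrt(lumi / lumi_ref)


# Start from the 36 fb^-1 value and extrapolate to the other luminosities.
z_300 = scale_significance(2.12, 36.0, 300.0)
z_3000 = scale_significance(2.12, 36.0, 3000.0)
```

The extrapolated values agree with the tabulated ones within the rounding of the 36 fb$^{-1}$ entry, which is the behaviour expected when only statistical uncertainties are considered.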
V Jet Re-Clustering
Jet re-clustering corresponds to using standard anti-$k_t$ jets as input to jet clustering algorithms. In addition to good noise suppression characteristics, a further practical advantage of using re-clustered jets is to avoid maintaining many dedicated calibrations for each combination of jet algorithm parameters.
To study the effect of jet re-clustering, we used anti-$k_t$ jets as input to the C/A jet reconstruction algorithm, with , and applied a cut to the resulting jets, before using them as input to the BDRS Higgs tagger as in the analysis described above.
Apart from statistical fluctuations, no significant differences were found between the analyses with tower jets and with re-clustered jets, either in the shape of the invariant mass distribution or in the significance.
VI Control Region
A control region is proposed, which may be used to constrain the background normalization in the signal region. This control region is defined by an event selection identical to that of the signal region, except that the two Higgs candidate sub-jets are anti-$b$-tagged. The probabilities associated with requiring two anti-$b$-tags on the two subjets of the Higgs candidate jet returned by the BDRS Higgs tagger are complementary to the working point used in the signal region. This results in the invariant mass distribution shown in Figure 2 for the HL-LHC scenario. As expected, this region is dominated by the background, with signal accounting for only 0.5% of the event yield in the mass region.
VII LHC Scenario
The same analysis selection optimized for the HL-LHC scenario, i.e. using the HL-LHC simulation and assuming of integrated luminosity, was then applied to the LHC scenario, where the ATLAS detector (fast simulation) and or were assumed. The ATLAS fast simulation model approximates the current ATLAS detector. Notable differences with respect to the HL-LHC simulation are a slightly less performant $b$-tagging and a 2 T magnetic field (instead of a 3 T field in the HL-LHC case de Favereau et al. (DELPHES 3 Collaboration)). The mass distribution obtained for the LHC scenario is presented in Figure 3 for an integrated luminosity of 300 fb$^{-1}$.
The significance and $S/B$ of the analysis (optimized for the HL-LHC and applied to both scenarios) are shown for different integrated luminosities in Table 3. As before, these variables are computed in the mass window between 60 and 160 GeV.
The significance and $S/B$ decrease slightly in the LHC scenario, mainly because of the less performant $b$-tagging. The results in the table indicate that $t\bar{t}H$ could be observed already by the end of the LHC programme, with an integrated luminosity of 300 fb$^{-1}$ and a significance of $5.41 \pm 0.12\,\sigma$.
VIII Top Yukawa Coupling Uncertainty
The expected precision of a top Yukawa coupling () determination at the LHC and the HL-LHC was estimated from the uncertainty in the number of signal events. Values are shown in Table 4. The cross section is proportional to the top Yukawa coupling squared, , where includes all the factors associated to a cross section computation. In this estimate and are considered not to have associated errors.
IX Signal Strength
The signal strength is obtained by minimizing et al. (Particle Data Group), defined as
[TABLE]
where is the number of bins in the distribution, is the signal strength and is the corresponding best estimator, is the likelihood estimator being maximized and is the unconditional maximum likelihood estimator. Finally, and are the expected number of signal and background events, and is the number of events in the simulated distribution.
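The symbols of the test statistic were lost in extraction. For orientation, a standard form that matches the definitions listed above is the binned Poisson likelihood ratio. The notation below ($N$, $\mu$, $\hat{\mu}$, $L$, $s_i$, $b_i$, $n_i$) is an assumption and may differ from the authors' own expression.

```latex
-2\ln\lambda(\mu) = -2\ln\frac{L(\mu)}{L(\hat{\mu})},
\qquad
L(\mu) = \prod_{i=1}^{N} \frac{(\mu s_i + b_i)^{n_i}}{n_i!}\, e^{-(\mu s_i + b_i)}
```

Here $N$ is the number of bins, $\mu$ the signal strength with best estimator $\hat{\mu}$, $s_i$ and $b_i$ the expected signal and background yields in bin $i$, and $n_i$ the number of events in bin $i$ of the simulated distribution.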
The distributions of for the LHC and HL-LHC scenarios, with an integrated luminosity of 300 fb$^{-1}$ and 3000 fb$^{-1}$, respectively, are presented in Figure 4. As expected from the larger integrated luminosity, the error on the signal strength decreases in the HL-LHC scenario.
The values for the signal strengths are shown in Table 5. An uncertainty on the signal strength of 18% is expected in the LHC scenario, while this error decreases to 5% in the HL-LHC scenario.
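To illustrate how an uncertainty on the signal strength can be read off a likelihood scan, the sketch below uses a single Poisson bin with Asimov data ($n = s + b$, so the best fit is $\mu = 1$) and finds where $-2\ln\lambda$ rises by one unit. The single-bin simplification, the yields `s = 100` and `b = 400`, and the function names are all assumptions made for illustration; they are not values from the analysis.

```python
import math


def q_mu(mu, s, b):
    """-2 ln(lambda) for one Poisson bin with Asimov data n = s + b (best fit mu = 1)."""
    n = s + b
    nu = mu * s + b
    return 2.0 * (nu - n + n * math.log(n / nu))


def upper_one_sigma(s, b, hi=3.0, tol=1e-9):
    """Find mu > 1 where q_mu crosses 1 by bisection.

    Assumes q_mu(hi) > 1, which holds for the illustrative yields used below.
    """
    lo = 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q_mu(mid, s, b) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Usage with hypothetical yields: the upper error is close to sqrt(s + b) / s.
s, b = 100.0, 400.0
upper_error = upper_one_sigma(s, b) - 1.0
gaussian_estimate = math.sqrt(s + b) / s
```

For large yields the scan reproduces the Gaussian estimate $\sqrt{s+b}/s$, with a slightly larger upper error because the Poisson likelihood is asymmetric.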
X Search for a pure pseudo-scalar boson
The production of a pseudo-scalar in association with two top quarks was also considered in this work. Although it has been excluded that the observed Higgs is a pure pseudo-scalar, fermion vertices may yet uncover the presence of a pseudo-scalar component. The production cross section for the pseudo-scalar signal is about half of the $t\bar{t}H$ one, and decays were considered. The mass distribution of the Higgs candidate jets for the pseudo-scalar signal sample and SM backgrounds, in the HL-LHC scenario, is shown in Figure 5. The pseudo-scalar and $t\bar{t}H$ distributions have similar shapes, differing only in the number of events due to the different cross sections.
The significance and $S/B$ for pseudo-scalar production were computed for different integrated luminosities, and are shown in Table 6. These variables are computed, as before, in the mass window between 60 and 160 GeV.
The estimated significance and $S/B$ are lower for pseudo-scalar production. Observation of pseudo-scalar production would require at least in the HL-LHC scenario, with an expected significance of .
XI Conclusions
An analysis strategy for the semileptonic $t\bar{t}H$ ($H\rightarrow b\bar{b}$) channel is proposed in this paper. It relies on the reconstruction of boosted Higgs bosons using large radius jets and jet substructure information to identify the objects of interest and suppress backgrounds.
It improves the analysis significance by a factor 3 with respect to Reference M. L. Mangano et al. (2016) (which was optimized for the Future Circular Collider). Moreover, it was observed that the re-clustering technique may be used without affecting the results. Finally, a control region is proposed, kinematically close to the signal region, although orthogonal to it through the use of anti-$b$-tagging.
Results indicate that $t\bar{t}H$ could be observed with an integrated luminosity of 300 fb$^{-1}$ in the LHC scenario, using the optimized strategy, with a significance of $5.41 \pm 0.12\,\sigma$.
The top Yukawa coupling extracted from the proposed analysis is expected to have a 35% uncertainty by the end of the LHC programme, considering an integrated luminosity of 300 fb$^{-1}$. This uncertainty decreases to 17% in the HL-LHC scenario with an integrated luminosity of 3000 fb$^{-1}$.
A multivariate method (MVA) could further discriminate between the signal and the backgrounds, exploiting correlations between discriminating variables such as and ratios for the Higgs candidate jets. Finally, it should be noted that this paper does not consider the effects of pile-up, nor does it use a full simulation of the detector. The analysis sensitivity is expected to decrease when these realistic effects are introduced, but to remain competitive in terms of significance.
Acknowledgements
The authors would like to thank our colleagues and friends Liliana Apolinário, Pedro Abreu, João Martins, Aidan Kelly, and Silvia Biondi, for various discussions, support, and good advice, as well as for part of the and di-jet samples used. This work was partially supported by Fundação para a Ciência e Tecnologia, FCT (Projects No. CERN/FIS-PAR/0008/2017).
References
- [1] Collaboration (2012a): ATLAS Collaboration, Phys. Lett. B 716, 1–29 (2012), arXiv:1207.7214 [hep-ex].
- [2] Collaboration (2012b): CMS Collaboration, Phys. Lett. B 716, 30 (2012), arXiv:1207.7235 [hep-ex].
- [3] M. A. et al. (ATLAS Collaboration), Phys. Lett. B 784, 173 (2018a), arXiv:1806.00425v1 [hep-ex].
- [4] A. S. et al. (CMS Collaboration), Phys. Rev. Lett. 120, 231801 (2018a), arXiv:1804.02610v2 [hep-ex].
- [5] M. A. et al. (ATLAS Collaboration), (2018b), arXiv:1808.08238v1 [hep-ex], CERN-EP-2018-215.
- [6] A. S. et al. (CMS Collaboration), (2018b), arXiv:1808.08242v1 [hep-ex], CMS-PAS-HIG-18-016, CERN-EP-2018-223.
- [7] M. A. et al. (ATLAS Collaboration), (2018c), ATLAS-CONF-2018-021.
- [8] A. S. et al. (CMS Collaboration), Phys. Lett. B 779, 283 (2018c), arXiv:1708.00373v2 [hep-ex].
