Properties of jet fragmentation using charged particles measured with the ATLAS detector in $pp$ collisions at $\sqrt{s}=13$ TeV

ATLAS Collaboration

arXiv:1906.09254·hep-ex·December 3, 2019


TL;DR

This paper measures jet fragmentation properties in proton-proton collisions at 13 TeV using charged particles with the ATLAS detector, comparing results with models and exploring quark/gluon jet differences.

Contribution

It provides the first measurement of charged-particle multiplicity using model-independent jet labels and compares fragmentation data with Monte Carlo simulations across a wide phase space.

Findings

01 Models describe quark-like jets well but overestimate the number of charged particles in gluon-like jets.

02 Data show significant differences from simulations, especially for gluon-like jets.

03 First measurement of the charged-particle multiplicity using model-independent jet labels (topic modeling) to separate quark-like and gluon-like jets.

Abstract

This paper presents a measurement of quantities related to the formation of jets from high-energy quarks and gluons (fragmentation). Jets with transverse momentum 100 GeV $< p_{\text{T}} <$ 2.5 TeV and pseudorapidity $|\eta| < 2.1$ from an integrated luminosity of 33 fb$^{-1}$ of $\sqrt{s} = 13$ TeV proton-proton collisions are reconstructed with the ATLAS detector at the Large Hadron Collider. Charged-particle tracks with $p_{\text{T}} > 500$ MeV and $|\eta| < 2.5$ are used to probe the detailed structure of the jet. The fragmentation properties of the more forward and the more central of the two leading jets from each event are studied. The data are unfolded to correct for detector resolution and acceptance effects. Comparisons with parton shower Monte Carlo generators indicate that existing models provide a reasonable description of the data across a wide range of phase space, but there are also…

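Tables

Table 1: A summary of the object and event selection criteria at particle level and detector level. Entries that are identical in both columns apply to both levels.

| | Particle level | Detector level |
|---|---|---|
| Pileup | – | Identify primary vertex |
| Jet algorithm | Anti-$k_{t}$, $R = 0.4$ | Anti-$k_{t}$, $R = 0.4$ |
| Jet requirements | $\lvert\eta\rvert < 2.1$ | $\lvert\eta\rvert < 2.1$ |
| Jet constituents | Particles with $c\tau > 10$ mm prior to detector interactions, excluding $\mu$ and $\nu$ | Calorimeter energy clusters |
| Measurement inputs | Charged jet constituents, $p_{\text{T}} > 500$ MeV and $\lvert\eta\rvert < 2.5$ | Ghost-associated tracks, $p_{\text{T}} > 500$ MeV and $\lvert\eta\rvert < 2.5$ |
| Event selection | At least two jets, with the leading two satisfying $p_{\text{T}}^{\text{lead}} / p_{\text{T}}^{\text{sublead}} < 1.5$ | At least two jets, with the leading two satisfying $p_{\text{T}}^{\text{lead}} / p_{\text{T}}^{\text{sublead}} < 1.5$ |
| Jet selection | Leading two, separated by $\eta$ (more forward/central) | Leading two, separated by $\eta$ (more forward/central) |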

Equations

$$\mu\frac{\partial}{\partial\mu}D_{p}^{h}(\zeta,\mu)=\sum_{p^{\prime}}\int_{\zeta}^{1}\frac{d\zeta^{\prime}}{\zeta^{\prime}}\,\frac{\alpha_{\text{S}}(\mu)\,P_{p^{\prime}\leftarrow p}(\zeta^{\prime},\mu)}{\pi}\,D_{p^{\prime}}^{h}\left(\frac{\zeta}{\zeta^{\prime}},\mu\right),$$

$$\langle n_{\text{ch}}\rangle(p_{\text{T}}^{\text{jet}})=\sum_{p}f_{p}(p_{\text{T}}^{\text{jet}})\sum_{h\,\text{charged}}\int_{\text{threshold}/p_{\text{T}}^{\text{jet}}}^{1}d\zeta\,D_{p}^{h}(\zeta,p_{\text{T}}^{\text{jet}}),$$

$$F(\zeta,p_{\text{T}}^{\text{jet}})=\sum_{p}f_{p}(p_{\text{T}}^{\text{jet}})\sum_{h\,\text{charged}}D_{p}^{h}(\zeta,p_{\text{T}}^{\text{jet}}).$$

$$x_{\text{unfolded},i}=\frac{1}{n_{\text{jets, unfolded}}}\sum_{j=1}^{N_{\text{total}}}\theta_{ij}\,x_{\text{detected},j}\left(\frac{1-\epsilon_{\text{reco not true},j}}{1-\epsilon_{\text{true not reco},i}}\right),$$

$\langle x^{\kappa}\rangle_{\text{unfolded}}(p_{\text{T}}\ \text{bin}\ j)$

$h_{i}^{f}$

$h_{i}^{c}$

$h_{i}^{T_{1}}$

$h_{i}^{T_{2}}$

$$\left\langle\sum_{i\in\text{jet}}\zeta_{i}^{\kappa}\right\rangle_{\text{gluons}}(p_{\text{T}})\;\underset{\sim}{\propto}\;\log\left(p_{\text{T}}^{2}/\Lambda^{2}\right)^{2P_{g\leftarrow g}(\kappa)/\beta_{0}},$$


Full text

Preprint: CERN-EP-2019-090

Journal reference: Phys. Rev. D 100 (2019) 052011

DOI: 10.1103/PhysRevD.100.052011

Abstract: This paper presents a measurement of quantities related to the formation of jets from high-energy quarks and gluons (fragmentation). Jets with transverse momentum $100$ GeV $<p_{\text{T}}<2.5$ TeV and pseudorapidity $|\eta|<2.1$ from an integrated luminosity of 33 fb$^{-1}$ of $\sqrt{s}=13$ TeV proton–proton collisions are reconstructed with the ATLAS detector at the Large Hadron Collider. Charged-particle tracks with $p_{\text{T}}>500$ MeV and $|\eta|<2.5$ are used to probe the detailed structure of the jet. The fragmentation properties of the more forward and the more central of the two leading jets from each event are studied. The data are unfolded to correct for detector resolution and acceptance effects. Comparisons with parton shower Monte Carlo generators indicate that existing models provide a reasonable description of the data across a wide range of phase space, but there are also significant differences. Furthermore, the data are interpreted in the context of quark- and gluon-initiated jets by exploiting the rapidity dependence of the jet flavor fraction. A first measurement of the charged-particle multiplicity using model-independent jet labels (topic modeling) provides a promising alternative to traditional quark and gluon extractions using input from simulation. The simulations provide a reasonable description of the quark-like data across the jet $p_{\text{T}}$ range presented in this measurement, but the gluon-like data have systematically fewer charged particles than the simulation.

1 Introduction

Jets are collimated sprays of particles resulting from high-energy quark and gluon production. The details of the process that underlies the fragmentation of quarks and gluons with net quantum chromodynamic (QCD) charge into net neutral hadrons are not fully understood. Jet formation is a complex multi-scale problem, including important contributions from QCD effects that cannot be described by perturbation theory. Measuring basic quantities related to fragmentation is therefore essential to furthering our understanding of the emergent properties of QCD.

Perturbative and non-perturbative physically inspired models have free parameters that are tuned to data in order to best describe the radiation pattern inside jets [1]. This is in turn an important input to all analyses at the Large Hadron Collider (LHC) due to the ubiquity of jets. Measurements of jet substructure in proton–proton ( $pp$ ) collisions at a center-of-mass energy of $\sqrt{s}=7$ TeV [2, 3, 4, 5] have already been used by the ATLAS Collaboration for parameter optimizations (tunes) of the Pythia 8 Monte Carlo (MC) generator [6]. A measurement of the average number of charged particles inside jets at $\sqrt{s}=8$ TeV [7] was also used as input to recent developments in the Herwig 7 MC program [8]. Further measurements of jet constituent multiplicity and energy sharing will provide powerful constraints for future generator optimizations.

Quark- and gluon-initiated jets (henceforth quark and gluon jets) have different radiation patterns (see e.g. Ref. [9]). As many analyses at the LHC target either quark-enriched or gluon-enriched processes, these radiation-pattern differences can be useful for jet tagging [10, 11]. Measurements of jet structure can be used to calibrate quark-versus-gluon jet taggers. By exploiting the rapidity dependence of the relative quark and gluon jet rates, ATLAS [7] extracted the average charged-particle multiplicity for quark and gluon jets separately. This was then combined with detector-level systematic uncertainties to provide quark/gluon tagger uncertainties at $\sqrt{s}=13$ TeV [12]. A more complex tagger based on several jet shapes could be calibrated in a similar manner using extended results. The benefit of a particle-level measurement is that a portion of the calibration can be independent of ATLAS and LHC operating conditions. Uncertainties in detector effects can be updated with the changing detector environment. Adding more observables and measuring their differential distributions will improve this calibration.

Although the full radiation pattern inside jets is not calculable from first principles, the energy dependence of many observables can be calculated in perturbation theory. There have been significant theoretical advances in soft-collinear effective theory (SCET) [13, 14, 15, 16] to derive factorization theorems that describe the evolution of universal non-perturbative functions [17, 18, 19, 20]. This was applied to the measurement of jet charge at $\sqrt{s}=8$ TeV [21]. There have also been predictions and comparisons with the jet transverse momentum ( $p_{\text{T}}$ ) dependence of the average number of charged particles inside jets (see Ref. [7] and references therein). This quantity does not have a perturbative expansion in the usual sense (as a series in $\alpha_{\text{S}}$ ); instead there is a series expansion in $\sqrt{\alpha_{\text{S}}}$ [22, 23]. This behavior is predicted for a wide class of Sudakov safe observables [24]. At least for the case of charged-particle multiplicity, this non-standard expansion seems to be an excellent model of the data [7].

The goal of this paper is to measure properties of jet fragmentation using charged-particle tracks. Such properties have been measured at many colliders at various center-of-mass energies, including the SPS [25, 26, 27], PETRA [28, 29], PEP [30, 31, 32, 33], TRISTAN [34], CESR [35], LEP [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], HERA [48, 49], and the Tevatron [50, 51, 52, 53]. Previous measurements by the ATLAS and CMS Collaborations were performed at $\sqrt{s}=2.76$ TeV [54, 55], $\sqrt{s}=5.02$ TeV [56, 57, 58], $\sqrt{s}=7$ TeV [59, 2, 60] and $\sqrt{s}=8$ TeV [7, 21, 61] in $pp$ collisions and are also compared with jet fragmentation measured in Pb+Pb collisions [54, 62, 56, 57, 58, 63] and p+Pb collisions [56]. The measurement presented here represents a significant extension of previous work. In particular, the accessible jet energy range is increased due to the larger $\sqrt{s}=13$ TeV. There are enough events in the 2016 dataset to probe the substructure of jets with $p_{\text{T}}$ up to 2.5 TeV. Next, the precision of the measurement has improved due to advances in track reconstruction inside jets during the long shutdown between LHC Runs 1 and 2, including the additional insertable B-layer (IBL) detector [64, 65] and new algorithms for tracking inside dense environments [66, 67, 68]. Furthermore, detailed experimental studies to derive uncertainties in all aspects of tracking inside jets extend the capabilities of previous measurements to a wider region of phase space and also allow differential analyses [67, 69]. These new data therefore probe broader and deeper aspects of the radiation pattern inside jets across an extended phase space.

The paper is organized as follows. Section 2 introduces the observables to be measured. Then, following a brief description of the ATLAS detector in Section 3, the data and simulation samples are documented in Section 4. Charged-particle track, jet, and event reconstruction are detailed in Section 5. Corrections for detector effects (unfolding) are documented in Section 6. A description of the corresponding systematic uncertainties can be found in Section 7 and the results are presented in Section 8. Section 9 provides conclusions and future outlook.

2 Observables

This analysis builds upon the previous ATLAS jet structure measurements presented in Refs. [59, 7, 21]. The fundamental quantity is the fragmentation function $D_{p}^{h}(z,E)$, which describes the probability of finding a hadron $h$ with energy fraction $z$ of the parton $p$ that has energy $E$. At a hadron collider, the jet transverse momentum, $p_{\text{T}}$ (see footnote 1), is a better proxy for the starting scale ($\mu$) of jet evolution. To avoid confusion with previous measurements of similar observables, the transverse momentum fraction is denoted in this paper by the symbol $\zeta=p_{\text{T}}^{\text{particle}}/p_{\text{T}}^{\text{jet}}$.

Footnote 1: ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the center of the detector and the $z$-axis along the beam pipe. The $x$-axis points from the IP to the center of the LHC ring, and the $y$-axis points upwards. Cylindrical coordinates $(r,\phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the $z$-axis. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta=-\ln\tan(\theta/2)$. Angular distance is measured in units of $\Delta R\equiv\sqrt{(\Delta\eta)^{2}+(\Delta\phi)^{2}}$.

The fragmentation function itself, like parton distribution functions (PDFs), cannot be calculated from first principles in perturbation theory. However, it has a DGLAP [70, 71, 72] evolution and so the $p_{\text{T}}$ dependence of many observables can be calculated. In particular:

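$$\mu\frac{\partial}{\partial\mu}D_{p}^{h}(\zeta,\mu)=\sum_{p^{\prime}}\int_{\zeta}^{1}\frac{d\zeta^{\prime}}{\zeta^{\prime}}\,\frac{\alpha_{\text{S}}(\mu)\,P_{p^{\prime}\leftarrow p}(\zeta^{\prime},\mu)}{\pi}\,D_{p^{\prime}}^{h}\left(\frac{\zeta}{\zeta^{\prime}},\mu\right),\qquad(1)$$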

where $P_{p^{\prime}\leftarrow p}(\zeta,\mu)$ are the Altarelli–Parisi splitting functions [70] and depend on the scale $\mu$ through $\alpha_{\text{S}}$ . Charged particles are studied because they provide a way to measure single hadrons inside the jet (as opposed to calorimeter energy deposits, which can result from multiple particles) which gives access to $\sum_{h}D_{p}^{h}$ . A basic quantity related to the fragmentation function is the charged-particle multiplicity. The average charged-particle multiplicity is an integral over $\zeta$ and a sum over $h$ and $p$ of $D_{p}^{h}(\zeta)$ . An extension of the multiplicity is the set of $\zeta$ moments of $D$ . The zeroth moment is the average multiplicity. The full distribution of multiplicity depends on (multi-hadron) fragmentation functions in a complicated way; a more direct probe of $D$ is to measure hadron production as a function of $\zeta$ , which is a sum of $D$ over $p$ and $h$ (but no integral over $\zeta$ ). Additional observables are also studied in order to probe the angular spread of jet fragmentation beyond the collinear limit. All of the observables are described below.

Charged-particle multiplicity ( $n_{\text{ch}}$ ): The number of charged particles inside a jet with $p_{\text{T}}$ above some threshold. In terms of the fragmentation function:

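$$\langle n_{\text{ch}}\rangle(p_{\text{T}}^{\text{jet}})=\sum_{p}f_{p}(p_{\text{T}}^{\text{jet}})\sum_{h\,\text{charged}}\int_{\text{threshold}/p_{\text{T}}^{\text{jet}}}^{1}d\zeta\,D_{p}^{h}(\zeta,p_{\text{T}}^{\text{jet}}),$$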

where $f_{p}$ is the fraction of parton type $p$ at a given jet $p_{\text{T}}$. The multiplicity is not calculable in perturbation theory, but to lowest order in $\sqrt{\alpha_{\text{S}}}$, the ratio of the multiplicity for gluon-initiated jets to that for quark-initiated jets is the ratio of color factors $C_{A}/C_{F}=9/4$. The fraction of quark jets increases with $p_{\text{T}}$, which decreases the inclusive multiplicity. However, this is compensated by an inherent increase in the multiplicity with $p_{\text{T}}$ for both quark and gluon jets [73]. In addition to the mean, the full $(1/N_{\text{jet}})dN_{\text{jet}}/dn_{\text{ch}}$ distribution is measured.
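The color-factor arithmetic behind the $9/4$ value can be checked directly. The sketch below uses the standard SU(3) expressions $C_{A}=N_{c}$ and $C_{F}=(N_{c}^{2}-1)/(2N_{c})$; the helper `inclusive_multiplicity` and the numbers passed to it are purely illustrative assumptions and are not measured values.

```python
from fractions import Fraction

N_C = 3  # number of QCD colors

C_A = Fraction(N_C)                   # color factor for gluon emission off a gluon
C_F = Fraction(N_C**2 - 1, 2 * N_C)   # color factor for gluon emission off a quark

# Lowest-order ratio of gluon-jet to quark-jet multiplicity.
gluon_to_quark = C_A / C_F


def inclusive_multiplicity(gluon_fraction, n_quark, n_gluon):
    """Mix quark- and gluon-jet multiplicities by their fractions (two-flavor toy)."""
    return gluon_fraction * n_gluon + (1.0 - gluon_fraction) * n_quark


# Hypothetical inputs: a lower gluon fraction pulls the inclusive mean down.
high_gluon = inclusive_multiplicity(0.7, n_quark=10.0, n_gluon=20.0)
low_gluon = inclusive_multiplicity(0.3, n_quark=10.0, n_gluon=20.0)
```

With these toy inputs the inclusive mean falls as the gluon fraction drops, which is the effect that the text says is compensated by the intrinsic growth of the multiplicity with $p_{\text{T}}$.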

Summed fragmentation function: The distribution of the momentum fraction $\zeta$ is studied inside jets summed over charged-hadron types. The quantity that is measured is $F(\zeta,p_{\text{T}}^{\text{jet}})=(1/N_{\text{jet}})dn_{\text{ch}}/d\zeta$ . In terms of the fragmentation function:

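$$F(\zeta,p_{\text{T}}^{\text{jet}})=\sum_{p}f_{p}(p_{\text{T}}^{\text{jet}})\sum_{h\,\text{charged}}D_{p}^{h}(\zeta,p_{\text{T}}^{\text{jet}}).$$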

By definition, $\int d\zeta F(\zeta)=\langle n_{\text{ch}}\rangle$ . In addition to measuring the distribution $F(\zeta)$ in bins of jet $p_{\text{T}}$ , summary statistics of the $F(\zeta)$ distribution are extracted to show how the distribution evolves with jet $p_{\text{T}}$ . The following properties of the $\zeta$ distribution are extracted:

• Partial fractions of $F(\zeta)$: $\int_{0}^{X}F(\zeta)d\zeta/\int F(\zeta)d\zeta=n_{\text{ch}}(\zeta<X)/n_{\text{ch}}$, which shows what fraction of the charged particles in the jet carries a $p_{\text{T}}$ fraction below $X$. For illustration, the values considered are $X\in\{0.1,0.01,0.001\}$. As $X\rightarrow 1$, these partial fractions become a constant value of $1.0$, independent of the jet $p_{\text{T}}$.

• Moments of $F(\zeta)$: $\langle\zeta^{\kappa}\rangle=\int\zeta^{\kappa}F(\zeta)d\zeta/\int F(\zeta)d\zeta$. $F(\zeta)$ is nearly normally distributed in $\log(\zeta)$, which means that it is defined by its first two moments [73]. For this reason, $\kappa=2$ is measured as a function of the jet $p_{\text{T}}$. For illustration, the case $\kappa=1/2$ is also considered.

• Weighted sums over the jet: $\langle\sum_{i\in\text{jet}}\zeta_{i}^{\kappa}\rangle=\int\zeta^{\kappa}F(\zeta)d\zeta$. The values considered are $\kappa\in\{1/2,2\}$. The observable $\sum_{i\in\text{jet}}\zeta_{i}^{2}$ is often called $p_{\text{T}}^{\text{D}}$ and can be used for quark/gluon jet tagging [74]. For a given jet type, these observables increase monotonically with increasing jet $p_{\text{T}}$ for $\kappa\lesssim 1$ and decrease monotonically for $\kappa\gtrsim 1$ (see Section 8.2); the $\kappa$ values chosen are representative of these trends.

Each of these derived quantities is extracted from the measured $F(\zeta)$ distribution. More details about the procedure for unfolding these derived quantities are presented in Section 6.
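The three kinds of summary statistics listed above differ only in what is integrated and whether the result is divided by $\int F(\zeta)d\zeta$. A minimal sketch, assuming a binned $F(\zeta)$ stored as a density per bin and approximating each bin by its center, is given below; the bin edges and values are invented toy numbers, and the function names are hypothetical.

```python
def bin_widths(edges):
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


def bin_centers(edges):
    return [(lo + hi) / 2.0 for lo, hi in zip(edges, edges[1:])]


def integral(F, edges):
    """Integral of F over zeta; by definition this is <n_ch>."""
    return sum(f * w for f, w in zip(F, bin_widths(edges)))


def partial_fraction(F, edges, X):
    """n_ch(zeta < X) / n_ch, summing only bins that lie fully below X."""
    below = sum(f * (hi - lo) for f, lo, hi in zip(F, edges, edges[1:]) if hi <= X)
    return below / integral(F, edges)


def weighted_sum(F, edges, kappa):
    """<sum_i zeta_i^kappa>: integral of zeta^kappa F, with no normalization."""
    return sum(c ** kappa * f * w
               for c, f, w in zip(bin_centers(edges), F, bin_widths(edges)))


def moment(F, edges, kappa):
    """<zeta^kappa>: the weighted sum divided by the integral of F."""
    return weighted_sum(F, edges, kappa) / integral(F, edges)


# Toy binned F(zeta) = (1/N_jet) dn_ch/dzeta; values are illustrative only.
edges = [0.0, 0.1, 0.5, 1.0]
F = [50.0, 2.5, 0.4]

mean_nch = integral(F, edges)
soft_fraction = partial_fraction(F, edges, 0.1)
charged_pt_fraction = weighted_sum(F, edges, 1)
```

Using bin centers is only an approximation to the integrals; Section 6 describes a dedicated correction for the difference between the bin center and the mean within a bin.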

Transverse momentum: $p_{\text{T}}^{\text{rel}}\equiv p_{\text{T}}^{\text{charged particle}}\sin\Delta\phi$ , where $\Delta\phi$ is the angle between the momentum of the constituent charged particle and the jet axis in the transverse plane. The quantity that is measured is $f(p_{\text{T}}^{\text{rel}},p_{\text{T}}^{\text{jet}})=(1/N_{\text{jet}})dn_{\text{ch}}/dp_{\text{T}}^{\text{rel}}$ . The average value is defined by $\langle p_{\text{T}}^{\text{rel}}\rangle=\int p_{\text{T}}^{\text{rel}}f(p_{\text{T}}^{\text{rel}})/\int f(p_{\text{T}}^{\text{rel}})$ .

Radial profile: The number of charged particles in various annuli around the jet axis. The quantity that is measured is $\rho_{\text{ch}}(r,p_{\text{T}}^{\text{jet}})=(1/N_{\text{jet}})dn_{\text{ch}}/2\pi rdr$ , where $r=\Delta R(\text{charged particle},\text{jet})$ . The average value is defined by $\langle r\rangle=\int r\rho_{\text{ch}}(r)/\int\rho_{\text{ch}}(r)$ .

The last two quantities are not simple derivatives of the fragmentation function as they additionally depend on finite opening angles encoded in the $d\theta/\theta$ emission phase space. Since quantities are measured as a function of jet $p_{\text{T}}$ which is defined using charged and neutral particles, the observables are sensitive to the charged-to-neutral fraction inside jets. However, this fraction is robust to mis-modelling as isospin is an approximate symmetry of the strong force.
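The per-particle inputs to these observables follow from the definitions given in this section. The sketch below is an illustration under stated assumptions: the function names are hypothetical, $\Delta\phi$ is wrapped into $[-\pi,\pi)$, and the magnitude of $\sin\Delta\phi$ is taken so that $p_{\text{T}}^{\text{rel}}$ is non-negative (the text does not specify a sign convention).

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta / 2) for a polar angle theta in radians."""
    return -math.log(math.tan(theta / 2.0))


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(delta_eta^2 + delta_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def zeta(track_pt, jet_pt):
    """Transverse momentum fraction of a charged particle within its jet."""
    return track_pt / jet_pt


def pt_rel(track_pt, track_phi, jet_phi):
    """p_T^rel = p_T * |sin(delta_phi)|; taking the magnitude is an assumption."""
    return track_pt * abs(math.sin(delta_phi(track_phi, jet_phi)))
```

For example, `delta_r(track_eta, track_phi, jet_eta, jet_phi)` gives the radius $r$ used to fill the radial profile.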

3 ATLAS detector

The ATLAS detector [75] at the LHC covers nearly the entire solid angle around the collision point. It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroidal magnets. The inner-detector system (ID) is immersed in a $2\,\mathrm{T}$ axial magnetic field and provides charged-particle tracking in the range $|\eta|<2.5$.

The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit being normally in the IBL. It is followed by the silicon microstrip tracker (SCT) which usually provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to $|\eta|=2.0$ .

The calorimeter system covers the pseudorapidity range $|\eta|<4.9$ . Within the region $|\eta|<3.2$ , electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering $|\eta|<1.8$ , to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry is provided by the steel/scintillating-tile calorimeter, segmented into three barrel structures within $|\eta|<1.7$ , and two copper/LAr hadronic endcap calorimeters. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic measurements respectively.

Interesting events are selected to be recorded by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [76]. The first-level trigger reduces the $40\,\mathrm{MHz}$ bunch crossing rate to below $100\,\mathrm{kHz}$, which the high-level trigger further reduces in order to record events to disk at about $1\,\mathrm{kHz}$.

4 Datasets and simulated samples

These measurements use the dataset of $pp$ collisions recorded by the ATLAS detector in 2016, corresponding to an integrated luminosity of 33 fb$^{-1}$ at a center-of-mass energy of $\sqrt{s}=13$ TeV. Events are only considered if they are collected during stable beam conditions and satisfy all data quality requirements. Due to the high instantaneous luminosity and the large total inelastic proton–proton cross section, on average there are about 25 simultaneous (pileup) collisions in each bunch crossing.

The measurements presented in this paper use a variety of MC samples for estimating correction factors as well as for comparison with the corrected data. Dijet events were generated at leading order (LO) with Pythia 8.186 [77], with the $2\rightarrow 2$ matrix element convolved with the NNPDF2.3LO PDF set [78] and using the A14 tune of multiple-parton-interaction and shower parameters [6]. Pythia uses a $p_{\text{T}}$ -ordered parton shower model. Additional dijet events were simulated using different generators, in order to study the impact of modeling uncertainties. Sherpa 2.1 [79] events were generated using multi-leg $2\rightarrow 2$ and $2\rightarrow 3$ matrix elements, which were matched to parton showers following the CKKW prescription [80]. These Sherpa events were simulated using the CT10 PDF set [81] and the default Sherpa event tune. Herwig++ 2.7 [82, 83] was used to provide a sample of events with an angle-ordered parton shower model. These events were generated with the $2\rightarrow 2$ matrix element, convolved with the CTEQ6L1 PDF set [84] and configured with the UE-EE-5 tune [85].

All simulated events were passed through a full simulation of the ATLAS detector [86] implemented in Geant 4 [87], which describes the interactions of particles with the detector and the subsequent digitization of analog signals. The effects of multiple simultaneous $pp$ collisions were simulated with inelastic $pp$ collisions using the Pythia 8.186 generator with the A2 [88] set of tuned parameters and the MSTW2008LO [89] PDF set; these events were overlaid on the nominal dijet events.

5 Object and event selection

Since the data are unfolded to particle level, it is necessary to define both the particle-level and detector-level objects used in the measurement. The former are chosen to be as close as possible to the latter in order to minimize the model dependence caused by an extrapolation from the measured phase space at detector level to the phase space at particle level. Section 5.1 describes the definition of charged-particle tracks and jets. Following the discussion of objects, Section 5.2 describes the particle-level and detector-level event selection criteria.

5.1 Object reconstruction

While it is not possible to separate the underlying event from the hard scatter at particle level, it is possible to remove the contribution from pileup. Therefore, the unfolding target is particle-level distributions produced in single proton–proton interactions. However, at detector level, there is ambiguity about which $pp$ collision vertex corresponds to the hard-scatter event. Collision vertices are reconstructed from tracks in the inner detector. Each vertex is required to be associated with at least two tracks with $p_{\text{T}}>0.4$ GeV. The primary hard-scattering vertex of the event is chosen to be the one with the highest $\sum p_{\text{T}}^{2}$ calculated using all tracks associated with the vertex.
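The vertex choice described above amounts to a filter followed by an argmax. The sketch below assumes each vertex is represented simply as a list of the transverse momenta (in GeV) of its associated tracks; that data layout and the function name are hypothetical.

```python
def select_primary_vertex(vertices, min_tracks=2, min_track_pt=0.4):
    """Return the index of the hard-scatter vertex candidate, or None.

    vertices: list of lists of track pT values in GeV (assumed layout).
    A vertex needs at least `min_tracks` tracks with pT > `min_track_pt`;
    among those, the one with the largest sum of pT^2 over all of its
    associated tracks is chosen.
    """
    best_index = None
    best_sum_pt2 = -1.0
    for index, track_pts in enumerate(vertices):
        n_good = sum(1 for pt in track_pts if pt > min_track_pt)
        if n_good < min_tracks:
            continue
        sum_pt2 = sum(pt * pt for pt in track_pts)
        if sum_pt2 > best_sum_pt2:
            best_index, best_sum_pt2 = index, sum_pt2
    return best_index


# Toy event: a soft pileup-like vertex, a hard vertex, and a one-track vertex.
toy_vertices = [[0.5, 0.6, 0.7], [30.0, 25.0], [50.0]]
primary = select_primary_vertex(toy_vertices)
```

The single 50 GeV track in the last toy vertex has the largest $p_{\text{T}}^{2}$ but fails the two-track requirement, so the second vertex is selected.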

Particle-level jets are built from MC-simulated stable particles ( $c\tau>10$ mm) excluding muons and neutrinos. By definition, particles from pileup and from interactions with the detectors are not included. These jets are clustered using the anti- $k_{t}$ [90] algorithm with radius parameter $R=0.4$ as implemented in FastJet [91]. Detector-level jets are built from topological calorimeter-cell energy clusters [92] using the same algorithm as is used at particle level. A series of simulation- and data-based correction and calibration factors are applied to ensure that the resulting jet $p_{\text{T}}$ is the same as the particle-level value on average [93]. Jets are required to have $p_{\text{T}}>60$ GeV so that the rate of jets originating from pileup is negligible. The detector-level phase space includes one bin at low jet $p_{\text{T}}$ (60–100 GeV) which is not in the fiducial phase space of the measurement due to the large impact of migrations into and out of the acceptance.

Charged particles are used to compute the particle-level definitions of all observables if they are clustered within a particle-level jet and have $p_{\text{T}}>500$ MeV and $|\eta|<2.5$. The detector-level analog to charged particles is tracks. Tracks are reconstructed from hits in the inner detector (see e.g. Ref. [67]) and a series of quality criteria are applied to the selected tracks to reject those originating from hits due to multiple charged particles (fake tracks) and from pileup. The transverse momentum resolution is approximately $\sigma(p_{\text{T}})/p_{\text{T}}\approx 0.05\%\times p_{\text{T}}/\text{GeV}\oplus 1\%$, with a significant degradation in the core of high-$p_{\text{T}}$ jets due to challenges associated with pattern recognition (for example, the $p_{\text{T}}$ resolution is approximately 30% at 100 GeV when five or more particles are within $\Delta R<0.015$). Tracks are required to pass the tight primary selection as well as the loose track-to-vertex association [94]. In particular, they must have $p_{\text{T}}>500$ MeV and $|\eta|<2.5$, and the number of pixel and strip clusters associated with the track is required to be at least 9 (11) for $|\eta|<1.65$ ($\geq 1.65$). In addition, the transverse impact parameter $d_{0}$ relative to the beamline must be less than 2 mm and the longitudinal impact parameter, $z_{0}$, is required to satisfy $|z_{0}\sin\theta|<3$ mm. Tracks are matched to jets via ghost association [95]. This matching procedure creates ghost versions of the tracks with the same direction but infinitesimal $p_{\text{T}}$. Jet clustering is repeated and tracks are assigned to the jet that contains their ghosted version. For the isolated, high-$p_{\text{T}}$ jets used in this measurement, ghost association is nearly identical to a geometric matching based on $\Delta R<0.4$.
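The numerical requirements quoted above can be collected into a short sketch. It covers only the cuts stated in the text, not the full tight primary selection or the track-to-vertex association; the function and argument names are hypothetical, the $\oplus$ symbol is read as a sum in quadrature, and the $d_{0}$ requirement is assumed to apply to its absolute value.

```python
import math


def track_pt_resolution(pt_gev):
    """Approximate sigma(pT)/pT: (0.05% x pT/GeV) added in quadrature with 1%."""
    return math.hypot(0.0005 * pt_gev, 0.01)


def passes_track_cuts(pt_gev, eta, n_silicon_clusters, d0_mm, z0_sin_theta_mm):
    """Apply only the cuts quoted in the text (a subset of the full selection)."""
    min_clusters = 9 if abs(eta) < 1.65 else 11
    return (
        pt_gev > 0.5
        and abs(eta) < 2.5
        and n_silicon_clusters >= min_clusters
        and abs(d0_mm) < 2.0            # assumption: cut applies to |d0|
        and abs(z0_sin_theta_mm) < 3.0
    )


# The two resolution terms are equal at 20 GeV; the linear term dominates above.
resolution_at_20 = track_pt_resolution(20.0)
resolution_at_100 = track_pt_resolution(100.0)
```

This parametrization describes isolated tracks; it does not model the degradation in dense jet cores mentioned in the text.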

5.2 Event selection

Particle-level events are required to have at least two jets with $|\eta|<2.1$ (within the tracking detector acceptance) and the leading two such jets must satisfy $p_{\text{T}}^{\text{lead}}/p_{\text{T}}^{\text{sublead}}<1.5$ . This jet- $p_{\text{T}}$ balance requirement simplifies the interpretation of the final state in terms of a $2\rightarrow 2$ scattering process.
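As a sketch of this selection, assuming jets are given as `(pt, eta)` tuples (a hypothetical layout), the acceptance cut is applied first and the balance requirement is then tested on the two leading jets that survive it:

```python
def select_dijet(jets, max_abs_eta=2.1, max_pt_ratio=1.5):
    """Return (leading, subleading) jets if the event passes, else None.

    jets: list of (pt_gev, eta) tuples in any order (assumed layout).
    """
    accepted = sorted(
        (jet for jet in jets if abs(jet[1]) < max_abs_eta),
        key=lambda jet: jet[0],
        reverse=True,
    )
    if len(accepted) < 2:
        return None
    leading, subleading = accepted[0], accepted[1]
    if leading[0] / subleading[0] >= max_pt_ratio:
        return None
    return leading, subleading


# Toy events: the 700 GeV jet is outside |eta| < 2.1 and is ignored.
balanced = select_dijet([(500.0, 0.3), (450.0, -1.8), (700.0, 2.5)])
unbalanced = select_dijet([(900.0, 0.1), (500.0, 0.2)])
```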

Detector-level events are selected using single-jet triggers. Due to the large cross section for jet production, most of the jet triggers are prescaled: events that pass the trigger are randomly discarded with a fixed probability. The trigger used for a particular jet $p_{\text{T}}$ is chosen to ensure that the trigger is 100% efficient (for the measurement phase space and prior to prescaling) and has the lowest prescale factor. Events in data are weighted by the prescale. The lowest-threshold unprescaled jet trigger is used for jets with $p_{\text{T}}>600$ GeV. Detector-level events are required to pass the same selection requirements as particle-level events: there must be at least two offline calibrated jets with $|\eta|<2.1$ and the leading two of these jets must satisfy $p_{\text{T}}^{\text{lead}}/p_{\text{T}}^{\text{sublead}}<1.5$ . Figure 1 shows the basic kinematic properties of the two leading jets passing this event selection compared with various MC predictions at detector level.
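The trigger choice and prescale weighting can be illustrated as follows. Only the 600 GeV threshold for the unprescaled trigger comes from the text; the other trigger names, thresholds, and prescale values in the toy menu are invented for illustration, and in this sketch a trigger is treated as fully efficient above a single `efficient_above` value.

```python
def choose_trigger(jet_pt, trigger_menu):
    """Pick the fully efficient trigger with the lowest prescale, or None."""
    usable = [t for t in trigger_menu if jet_pt >= t["efficient_above"]]
    if not usable:
        return None
    return min(usable, key=lambda t: t["prescale"])


def event_weight(jet_pt, trigger_menu):
    """Data events are weighted by the prescale of the chosen trigger."""
    trigger = choose_trigger(jet_pt, trigger_menu)
    return None if trigger is None else trigger["prescale"]


# Hypothetical menu; only the unprescaled 600 GeV entry reflects the text.
toy_menu = [
    {"name": "jet_low", "efficient_above": 100.0, "prescale": 1000.0},
    {"name": "jet_mid", "efficient_above": 300.0, "prescale": 50.0},
    {"name": "jet_unprescaled", "efficient_above": 600.0, "prescale": 1.0},
]

weight_350 = event_weight(350.0, toy_menu)
weight_700 = event_weight(700.0, toy_menu)
```

Weighting by the prescale restores, on average, the number of events that the trigger would have accepted had none been discarded.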

The substructure of the two leading jets is used in the analysis. Figure 2 shows detector-level distributions for a selection of the observables that were introduced in Section 2. For the jets with $p_{\text{T}}\sim 1$ TeV shown in Figure 2, the most probable number of tracks is about 15 and the most probable momentum fraction is about 1%. The radiation pattern is peaked at the center of the jet, so both the $p_{\text{T}}^{\text{rel}}$ and $r$ distributions are peaked at zero. The Pythia, Herwig++, and Sherpa distributions generally bracket the data and are accurate to within about 20%.

In order to expose differences between quark and gluon jets, the more forward and more central of the two jets are distinguished and measured separately. Figure 3 shows the gluon-jet fraction as a function of jet $p_{\text{T}}$ and jet $\eta$ (more details about quark/gluon definitions are given in Section 8.2). For a fixed jet $p_{\text{T}}$ , higher- $|\eta|$ jets are more often quark-initiated due to valence quarks scattering off gluons. For a fixed $\eta$ , the quark fraction increases with jet $p_{\text{T}}$ due to the relative increase in valence-quark scattering off a quark or gluon compared with gluon–gluon scattering.
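Labeling the two selected jets is a comparison of $|\eta|$. A minimal sketch, again assuming hypothetical `(pt, eta)` tuples and resolving ties in favor of the first argument:

```python
def split_forward_central(jet_a, jet_b):
    """Return (more_forward, more_central), ordered by |eta|."""
    if abs(jet_a[1]) >= abs(jet_b[1]):
        return jet_a, jet_b
    return jet_b, jet_a


forward, central = split_forward_central((500.0, 0.3), (450.0, -1.8))
```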

Table 1 summarizes the object and event selections from Section 5.1 and Section 5.2.

6 Unfolding

The data are corrected for resolution and acceptance effects, and the fiducial phase space of the measurement is described by the particle-level object and event selection in Section 5. Equation (2) symbolically summarizes the unfolding procedure for a binned distribution $x$ :

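$$x_{\text{unfolded},i}=\frac{1}{n_{\text{jets, unfolded}}}\sum_{j=1}^{N_{\text{total}}}\theta_{ij}\,x_{\text{detected},j}\left(\frac{1-\epsilon_{\text{reco not true},j}}{1-\epsilon_{\text{true not reco},i}}\right),\qquad(2)$$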

where $n_{\text{jets, unfolded}}$ is the unfolded number of forward or central jets (depending on the bin), determined by the number of entries in the $n_{\text{ch}}$ unfolding (as there is one entry per jet). The symbols $\theta$ and $\epsilon$ represent the unfolding matrix and correction factors, described in more detail below.

The jet substructure observables are simultaneously unfolded with the jet $p_{\text{T}}$ and for the more forward and the more central jets at the same time. For an observable with $n_{\text{bins}}$ bins in a given $p_{\text{T}}$ bin, this results in a total of $N_{\text{total}}=2\times(n_{\text{bins}})\times(p_{\text{T}}\text{ bins})$ bins. All of these bins are concatenated to form a one-dimensional input. To begin the unfolding, the data are corrected for the fraction of events that pass the detector-level selection but not the particle-level selection, $\epsilon_{\text{reco not true}}$. This also corrects for non-dijet events, but their rate is negligible. Then, an iterative Bayesian (IB) unfolding technique [96] is used as a regularized matrix inversion to correct for the detector resolution in events that pass both the detector-level and particle-level selections. The IB method, with the unfolding matrix $\theta$, is implemented in the RooUnfold framework [97]; one iteration is used, a choice made to minimize the total uncertainty. After the application of the response matrix, a final correction is applied to account for the fraction of events that pass the particle-level but not detector-level selection, $\epsilon_{\text{true not reco}}$. The resulting unfolded measurement is reorganized into individual distributions with $n_{\text{bins}}$ per $p_{\text{T}}$ bin for each of the more forward and more central jet. The jet $p_{\text{T}}$ is also unfolded in parallel and each $p_{\text{T}}$ bin of the jet substructure observable is normalized by the number of measured jets in that bin. For $n_{\text{ch}}$, this renders the distributions normalized to unity per jet $p_{\text{T}}$ bin; for the other observables, the normalization in each $p_{\text{T}}$ bin is (up to acceptance effects) $\langle n_{\text{ch}}\rangle$, as discussed in Section 2.
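The sequence of corrections described above can be sketched in pure Python. This is an illustration of one iteration of Bayesian unfolding with the two acceptance corrections, not the RooUnfold implementation used in the analysis. The function name `unfold_once`, the toy response matrix, and the use of the true spectrum as the prior are assumptions made for the example, and the per-jet normalization $1/n_{\text{jets, unfolded}}$ is omitted.

```python
def unfold_once(measured, response, prior, fake_frac, miss_frac):
    """One Bayesian unfolding iteration with acceptance corrections.

    measured[j]   : detector-level bin contents
    response[j][i]: P(detector bin j | particle bin i) for events passing
                    both selections (each column i sums to 1)
    prior[i]      : assumed particle-level spectrum
    fake_frac[j]  : fraction passing detector-level but not particle-level
    miss_frac[i]  : fraction passing particle-level but not detector-level
    """
    n_reco, n_true = len(measured), len(prior)
    # Step 1: remove detector-level entries with no particle-level partner.
    corrected = [m * (1.0 - f) for m, f in zip(measured, fake_frac)]
    # Step 2: Bayes' theorem gives theta_ij = P(particle bin i | detector bin j).
    norms = [sum(response[j][k] * prior[k] for k in range(n_true))
             for j in range(n_reco)]
    unfolded = []
    for i in range(n_true):
        total = 0.0
        for j in range(n_reco):
            if norms[j] > 0.0:
                theta_ij = response[j][i] * prior[i] / norms[j]
                total += theta_ij * corrected[j]
        # Step 3: restore particle-level entries lost at detector level.
        unfolded.append(total / (1.0 - miss_frac[i]))
    return unfolded


# Closure check on toy numbers: folding `truth` with `response` gives `measured`.
truth = [100.0, 200.0]
response = [[0.8, 0.1], [0.2, 0.9]]
measured = [0.8 * 100.0 + 0.1 * 200.0, 0.2 * 100.0 + 0.9 * 200.0]
closure = unfold_once(measured, response, truth, [0.0, 0.0], [0.0, 0.0])
```

When the prior equals the true spectrum, as in this closure check, a single iteration returns the truth exactly; with a mismodeled prior the result is pulled toward the prior, which is the regularization that a small number of iterations provides.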

To illustrate the jet- $p_{\text{T}}$ dependence of the measured observables, the evolution with the jet $p_{\text{T}}$ of various moments ( $\kappa$ ) is computed using Eq. (3):

[TABLE]

where the sum is over all $i$ that correspond to $p_{\text{T}}$ bin $j$. Since the bin center is used to calculate the average, a correction $c_{\text{binning}}$ is applied to account for the difference between the bin center and the mean of the distribution within the bin. This correction is calculated using Pythia, and is computed by reweighting Pythia so that it agrees with the unfolded distribution. For $\zeta$, $p_{\text{T}}^{\text{rel}}$, and $r$, Eq. (3) represents the $\kappa$ moment for individual particles. For $\zeta$, the jet-based moments are also computed: $\langle\sum_{i\in\text{jet}}\zeta_{i}^{\kappa}\rangle$. For these jet-based moments, Eq. (3) is modified by removing the denominator $\sum_{i=1}^{n_{\text{bins}}}x_{\text{unfolded},i}$. By construction, the $\kappa=0$ jet-based moment of $\zeta$ is the $\kappa=1$ moment of $n_{\text{ch}}$. The binning correction factor is mostly near unity, deviating by less than 1% for $n_{\text{ch}}$ and up to about 10% for the other observables.
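As an illustration of the moment calculation described here, the sketch below computes a $\kappa$ moment from a binned unfolded distribution using bin centers. The function name, the use of arithmetic bin centers, and the treatment of $c_{\text{binning}}$ as a single multiplicative factor are assumptions made for this example; the exact form of Eq. (3) should be taken from the paper.

```python
import numpy as np

def binned_moment(unfolded, edges, kappa, c_binning=1.0, jet_based=False):
    """kappa-th moment of a binned distribution evaluated at bin centers (sketch).

    unfolded  : unfolded bin contents for one jet-pT bin
    edges     : bin edges, length len(unfolded) + 1
    c_binning : assumed multiplicative binning correction (hypothetical form)
    jet_based : if True, skip the division by the sum of bin contents,
                mirroring the 'remove the denominator' prescription in the text
    """
    unfolded = np.asarray(unfolded, dtype=float)
    edges = np.asarray(edges, dtype=float)
    centers = 0.5 * (edges[:-1] + edges[1:])   # assumption: arithmetic centers
    value = np.sum(unfolded * centers**kappa)
    if not jet_based:
        value /= np.sum(unfolded)
    return c_binning * value
```

With `jet_based=True` and `kappa=0` the function returns the plain sum of the bin contents, which for a per-jet normalized distribution is the average number of particles per jet, consistent with the statement that the $\kappa=0$ jet-based moment of $\zeta$ equals $\langle n_{\text{ch}}\rangle$.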

Figure 4 shows the response matrix normalized per particle-level bin. As stated above, the observable bins are concatenated with the jet $p_{\text{T}}$ and for both the more forward and more central jets to form a one-dimensional distribution that is unfolded. A diagonal stripe represents events where the detector-level jet $p_{\text{T}}$ is the same as the particle-level value; off-diagonal components represent jet $p_{\text{T}}$ migrations. Within a jet $p_{\text{T}}$ bin, there is a small dependence on $\zeta$ , with a worse resolution at high $\zeta$ due to the deteriorating momentum resolution at high track $p_{\text{T}}$ . The diagonal strips in the upper left and lower right quadrants correspond to events where the more forward particle-level jet is the more central detector-level jet and vice versa. This migration happens in about 1% of events. Within a given jet $p_{\text{T}}$ bin, the migrations to neighboring $\zeta$ bins are small. Except at high $\zeta$ and high jet $p_{\text{T}}$ where the migrations can reach 50%, the off-diagonal components of the response matrix are about 10%.

7 Uncertainties

Systematic and statistical uncertainties are assessed for each step of the analysis, including the acceptance correction factors, response matrix, and unfolding method. For each uncertainty, some component of the analysis chain is varied and then the entire unfolding procedure is repeated. Data and simulation statistical uncertainties are determined from pseudo-experiments using the bootstrap method [98]. The details of the experimental systematic uncertainties related to track and jet reconstruction are given in Section 7.1 and the uncertainties in the unfolding method and fragmentation modeling are described in Section 7.2. An additional source of uncertainty arising from binning effects is evaluated when computing the average value of an observable as a function of jet $p_{\text{T}}$ . The average values are determined using the bin centers, so the correction described in Section 6 relies on the simulation for the distribution within a given bin. An uncertainty in the binning correction is estimated by comparing the correction factors derived from Pythia with those from Sherpa, where both simulations are reweighted to match the unfolded data distribution.
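A common way to build bootstrap pseudo-experiments is to give every entry an independent Poisson-distributed weight with unit mean in each replica. The sketch below assumes that variant; the function name, the number of replicas, and the seed are hypothetical, and the exact prescription of Ref. [98] may differ in detail.

```python
import numpy as np

def bootstrap_histograms(values, edges, n_replicas=200, seed=0):
    """Poisson-bootstrap replicas of a histogram (sketch).

    Each replica reweights every entry by an independent Poisson(1) draw.
    In a full analysis each replica would be passed through the complete
    unfolding chain; the spread of the results estimates the statistical
    uncertainty and its bin-to-bin correlations.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    edges = np.asarray(edges, dtype=float)
    replicas = np.empty((n_replicas, len(edges) - 1))
    for k in range(n_replicas):
        weights = rng.poisson(1.0, size=values.size)
        replicas[k], _ = np.histogram(values, bins=edges, weights=weights)
    return replicas
```

Taking `replicas.std(axis=0)` gives a per-bin statistical uncertainty, and `np.cov(replicas, rowvar=False)` gives the corresponding covariance matrix.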

Figure 5 provides an overview of the systematic uncertainties for a selection of observables, using the average value versus $p_{\text{T}}$ for illustration. The uncertainty in the rate of fake and secondary tracks is the leading experimental reconstruction uncertainty for $\langle n_{\text{ch}}\rangle$ and $\langle r\rangle$ except at low jet $p_{\text{T}}$ where the uncertainties from the inclusive tracking efficiency and the unfolding procedure are larger. The jet energy uncertainties are the most important for $\zeta$ , with the tracking uncertainties matching in size in the highest jet- $p_{\text{T}}$ bins. The tracking and jet energy uncertainties are about the same size for $\langle p_{\text{T}}^{\text{rel}}\rangle$ . Fragmentation modeling uncertainties are large for $\langle n_{\text{ch}}\rangle$ at low jet $p_{\text{T}}$ and for $\langle\zeta\rangle$ at high jet $p_{\text{T}}$ . While the size of the binning correction uncertainty is less than 2% for $\langle p_{\text{T}}^{\text{rel}}\rangle$ and $\langle r\rangle$ , it is still the dominant uncertainty for these observables. Further details about each uncertainty source are given below and the full covariance matrices, including all correlation information, are made available in Ref. [99].

7.1 Track and jet reconstruction

Except for $\zeta$ , the jet energy is only used to determine the $p_{\text{T}}$ bin. Since the fragmentation properties vary slowly with jet $p_{\text{T}}$ , the resulting impact of jet energy scale and resolution uncertainties on the analysis is often less important than other sources of uncertainty. Nonetheless, the impact of a 19-parameter decomposition of the jet energy scale uncertainty was evaluated [93]. Six of these 19 components are due to in situ constraints on the jet energy scale from various multi-object balance studies, such as $Z$ +jets. Additional sources of uncertainty are related to pileup, jet flavor, and extrapolations to high $p_{\text{T}}$ . The total uncertainty in the jet energy scale is about 1% for jets with $p_{\text{T}}$ between 100 and 1000 GeV and the impact on this measurement is much less than 1% except at high $\zeta$ , where it can reach as high as 2%. The impact of the jet energy resolution is determined from an ensemble of event samples with jet energies smeared within the uncertainty.

The most important experimental uncertainties are related to track reconstruction and cover the track reconstruction efficiency, the rate of fake and secondary tracks, the momentum scale, and density effects from pixel and strip cluster merging. In the Pythia simulation, approximately 60% of the charged particles / tracks inside jets are charged pions that are well matched,$^{3}$ 10% are well-matched kaons, 5% are well-matched protons, 15% are charged particles that are not matched to reconstructed tracks (inefficiency), 5% are secondaries (split equally between photon conversions and nuclear interactions), 1% are not well-matched tracks (fake tracks), and $\mathcal{O}(0.1\%)$ are pileup tracks wrongly matched to the primary hard-scattering vertex. The pileup contribution decreases with jet $p_{\text{T}}$ and momentum fraction, but increases with jet cone size (reaching 1% at $\Delta R=0.4$). In contrast, the fake-track rate increases slightly with jet $p_{\text{T}}$ and has a contribution at high-momentum fraction of a few percent from kinked tracks reconstructed with a very high $p_{\text{T}}$. The reconstruction inefficiency grows with jet $p_{\text{T}}$, and is peaked at both low and high radial distance from the center of the jet and is reduced at high momentum fraction. This is because tracks with a larger radial distance from the jet axis tend to have lower $p_{\text{T}}$ (larger material effects and thus lower efficiency), while tracks in the core of the jet suffer from an inefficiency in the pattern recognition in the dense environment.

$^{3}$ Reconstructed tracks are matched to charged particles by examining the pattern of sensors where energy was deposited. If over 50% of the weighted number of measurements on a track are due to one charged particle, it is declared matched to the track. The weights are chosen to reflect the amount of information present in each detector and are ten for the pixel detector, five for the strip detector, and one for the straw tube tracker.

The uncertainty in the inclusive track reconstruction efficiency is dominated by the uncertainty in the amount of material in the inner detector. Variations in the amount of material that are consistent with detector construction knowledge and measurements from secondary vertices [100] result in an uncertainty of 0.5% for $|\eta|<0.1$ , which grows to 2.7% for $2.3<|\eta|<2.5$ . This uncertainty is applied in the simulation by randomly removing tracks with a $p_{\text{T}}$ - and $|\eta|$ -dependent probability. This uncertainty dominates the $n_{\text{ch}}$ measurement for jet $p_{\text{T}}\lesssim 1$ TeV.
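The random track removal can be pictured with the following sketch. Only the two end-point values (0.5% for $|\eta|<0.1$ and 2.7% for $2.3<|\eta|<2.5$) come from the text; the linear interpolation between them, the omission of the $p_{\text{T}}$ dependence, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def illustrative_drop_probability(abs_eta):
    """Hypothetical |eta|-dependent removal probability.

    Interpolates linearly between the two values quoted in the text.
    The shape in between and the missing pT dependence are assumptions.
    """
    return np.interp(abs_eta, [0.1, 2.3], [0.005, 0.027])

def keep_mask(track_eta, drop_probability=illustrative_drop_probability, seed=0):
    """Boolean mask of tracks that survive the random removal (sketch)."""
    rng = np.random.default_rng(seed)
    abs_eta = np.abs(np.asarray(track_eta, dtype=float))
    # A track is dropped when its uniform random number falls below the probability.
    return rng.random(abs_eta.size) >= drop_probability(abs_eta)
```

The observables are then recomputed from the surviving tracks, and the difference from the nominal unfolded result is taken as the uncertainty.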

Since the ATLAS pixel detector measures the charge collected from ionization, it is possible to constrain the inefficiency from density effects by looking for single tracks with pixel charge consistent with two minimum-ionizing particles [67]. The resulting uncertainty is about 0.4% for tracks with $\Delta R<0.1$ and is validated with additional studies related to the charged-to-neutral ratio in the jet as well as the geometric orientation of pixel clusters [68]. This uncertainty is most important for the radial energy measurement at small radii from the jet axis and for the $n_{\text{ch}}$ measurement in the highest jet- $p_{\text{T}}$ bins.

The rate of fake tracks is studied inside jets by inverting some of the track quality criteria such as the fit $\chi^{2}/\text{NDF}$ and is found to agree between data and simulation at the 30% level [68]. A related source of uncertainty is due to the rate of secondary tracks. These tracks originate from real charged particles, but are the result of interactions in detector material and not direct fragmentation processes. The rate of secondaries is estimated by fitting the track $d_{0}$ distribution and is found to agree with simulation within about 30%. These rates are then varied to determine an uncertainty in the measurement. The fake-track rate is the leading source of uncertainty for $n_{\text{ch}}$ when $p_{\text{T}}\gtrsim 1$ TeV and when $\zeta\sim 1$ or $r\lesssim 0.05$ for all jet-$p_{\text{T}}$ bins. Uncertainties related to the modeling of pileup have a negligible impact.

The leading source of uncertainty in the track parameters is in the $q/p_{\text{T}}$ ( $q$ is the electric charge) from a potential sagitta distortion due to detector misalignment weak modes [94]. This bias is corrected and the uncertainty in the correction is about 0.1/TeV except at $\phi\approx 0$ and $|\eta|\sim 2.5$ where the correction can reach 1/TeV. The impact on the measurement is smaller than the other tracking uncertainties.

7.2 Unfolding method and fragmentation modeling

An uncertainty resulting from the unfolding method described in Section 6 is determined by unfolding the prediction from a reweighted simulation with the nominal procedure. The reweighted simulation is constructed by modifying the nominal Pythia 8 particle-level spectrum so that the simulated detector-level spectrum, from propagating the reweighted particle-level spectrum through the response matrix, has significantly improved agreement with the data. The modified detector-level distribution is unfolded with the nominal response matrix and the difference between this and the reweighted particle-level spectrum is an indication of the bias due to the unfolding method (in particular, the choice of prior) [101]. The weights are chosen by comparing the Pythia 8 particle-level spectrum with the unfolded data. After applying the reweighting, the $\chi^{2}/\text{NDF}$ calculated using only the statistical uncertainties improves significantly in each jet $p_{\text{T}}$ bin. The resulting systematic uncertainties are generally much smaller than the detector-level differences between the data and simulation, as desired.

The unfolded result depends on the modeling of jet fragmentation through the prior, the response matrix, and the correction factors. Variations in the prior are already accounted for in the data-driven non-closure uncertainty described above. The other contributions are evaluated by comparing the result using Pythia 8 with the result using the alternative Herwig++ sample described in Section 4. A similar uncertainty is obtained when using Herwig++ or Sherpa as the alternative model. This comparison is decomposed into components corresponding to varying only the response matrix or only the initial/final correction factors, $\epsilon_{\text{reco not true}}$ and $\epsilon_{\text{true not reco}}$ in Eq. (2). All three components are added in quadrature to determine the total uncertainty due to fragmentation modeling. Even though these sources of uncertainty are correlated, they were treated as independent because the level of correlation is unknown given that there are only two alternative models. The resulting uncertainty is much smaller than the difference between Pythia 8 and Herwig++ at particle level. For $n_{\text{ch}}$ , the response matrix is the dominant contribution to this uncertainty, except in the first jet- $p_{\text{T}}$ bin where the correction factors and their uncertainty are also important. For the per-particle observables ( $\zeta$ , $r$ , $p_{\text{T}}^{\text{rel}}$ ), the correction factors dominate the uncertainty because acceptance effects are much more important.

8 Results

The unfolded data are presented in two ways. Section 8.1 focuses on the inclusive spectra for both jets together, while Section 8.2 uses the differences between forward and central jets to determine the unique features of quark-initiated and gluon-initiated jets, some of which can be compared with perturbative QCD calculations. These sections show a selection of jet $p_{\text{T}}$ bins; a complete set of results can be found in Ref. [99].

8.1 Inclusive distributions

The unfolded averages of the measured observables are presented as a function of the jet $p_{\text{T}}$ in Figure 6 for the more forward and more central jets separately and then combined in Figure 7. All other figures in this section combine measurements of both jets. The more central jets show properties that are more gluon-like than the more forward jets: they have a larger charged-particle multiplicity and a softer momentum-fraction spectrum. The modeling of the all-jet spectra is very similar to that of the more forward and more central jets and is described in detail for the all-jet spectra only.

As the jet $p_{\text{T}}$ increases, the average charged-particle multiplicity increases, the average momentum fraction decreases, the average $p_{\text{T}}^{\text{rel}}$ increases, and the average multiplicity-weighted radius decreases. Charged-particle multiplicity increases from about 10 at jet $p_{\text{T}}$ of 100 GeV to just over 20 at 2.5 TeV. In most cases, Pythia 8 and Sherpa bracket the data, and are accurate to better than 10%; Herwig++ is often between these two and closer to the data. As the distribution of $n_{\text{ch}}$ is almost Poissonian, nearly all of the information about the distribution is encoded in the mean value. In contrast, the distribution of $\zeta$ is more complicated.$^{4}$ The average momentum fraction is about 5% at jet $p_{\text{T}}$ of 100 GeV and decreases to about 2.5% at 2.5 TeV (the most probable value, shown below, is lower). The distributions of $p_{\text{T}}^{\text{rel}}$ and the radial profiles fall steeply (nearly exponentially) away from zero and the average values in Figure 7 give a sense of how fast they fall (exponential distributions are uniquely specified by their mean). The average $p_{\text{T}}^{\text{rel}}$ at $p_{\text{T}}^{\text{jet}}=100$ GeV is about 0.35 GeV and increases to about 0.55 GeV at $p_{\text{T}}^{\text{jet}}=2.5$ TeV. If the angular distribution about the jet axis is independent of $p_{\text{T}}$, the average value of $p_{\text{T}}^{\text{rel}}$ should be proportional to $\langle\zeta\rangle(p_{\text{T}}^{\text{jet}})\times p_{\text{T}}^{\text{jet}}$. This would suggest an increase by a factor of $(2.5\%/5\%)\times(2500/100)\sim 12.5$ across the measured range; instead it only increases by a factor of about 1.5. This means that the angular distribution is not independent of $p_{\text{T}}$ and in particular, the jets become more collimated. This is also consistent with direct measurement of the radial profile, where the average value drops from about 0.06 at $p_{\text{T}}^{\text{jet}}=100$ GeV to about 0.03 at $p_{\text{T}}^{\text{jet}}=2.5$ TeV. While Pythia 8, Sherpa, and Herwig++ agree well with the data for $p_{\text{T}}^{\text{rel}}$, Sherpa provides a poorer model of the average radial profile as a function of the jet $p_{\text{T}}$.

$^{4}$ The distribution is nearly Gaussian in $\log\zeta$, so it is well specified by two parameters instead of one [73].
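The scaling argument for $\langle p_{\text{T}}^{\text{rel}}\rangle$ is simple arithmetic and can be checked directly. The sketch below uses only the approximate values quoted in the text; the function and variable names are hypothetical.

```python
def naive_ptrel_growth(zeta_low, zeta_high, pt_low, pt_high):
    """Growth of <pT^rel> expected if the angular distribution did not
    change with jet pT, i.e. <pT^rel> proportional to <zeta> * pT^jet."""
    return (zeta_high / zeta_low) * (pt_high / pt_low)

# Approximate values quoted in the text (jet pT in GeV, <pT^rel> in GeV).
expected_factor = naive_ptrel_growth(0.05, 0.025, 100.0, 2500.0)
observed_factor = 0.55 / 0.35

print(f"expected for fixed angular shape: {expected_factor:.1f}")
print(f"observed:                         {observed_factor:.2f}")
```

The observed growth is far smaller than the fixed-shape expectation, which is the quantitative basis for the statement that jets become more collimated as the jet $p_{\text{T}}$ increases.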

As noted above, the distribution of $\zeta$ cannot be described simply by its average value, in contrast to $n_{\text{ch}}$ , $p_{\text{T}}^{\text{rel}}$ and $r$ , which are nearly Poisson or exponentially distributed. Therefore, it is useful to summarize the $p_{\text{T}}^{\text{jet}}$ dependence of other aspects of the $\zeta$ distribution. Figure 8 shows partial integrals of the $\zeta$ distribution and Figure 9 shows the average values of $\zeta^{1/2}$ , $\zeta^{2}$ , $\sum_{i\in\text{jet}}\zeta^{1/2}$ and $\sum_{i\in\text{jet}}\zeta^{2}$ . Figure 8 illustrates how the average fraction of charged particles with a given momentum fraction evolves with $p_{\text{T}}^{\text{jet}}$ . There is no correction for binning effects, as the measured $\zeta$ distribution has bin edges which nearly align with $0.1\%$ , 1%, and 10%. In particular, the $\zeta$ bins are $1/1.5^{n}$ , for $n=0,\ldots,21$ , and the fractions in Figure 8 are estimated as $0.1\%\approx 1/1.5^{17}$ , $1\%\approx 1/1.5^{11}$ , and $10\%\approx 1/1.5^{5}$ . The fraction of particles carrying 10% or less of the momentum changes very little across the entire $p_{\text{T}}^{\text{jet}}$ range and is also near unity ( $>90\%$ for all $p_{\text{T}}^{\text{jet}}$ ). A strong $p_{\text{T}}^{\text{jet}}$ dependence is introduced when the $\zeta$ threshold is lowered to 1% and to 0.1%. Since charged particles are required to have $p_{\text{T}}>500$ MeV, only jets with $p_{\text{T}}^{\text{jet}}>500$ GeV can have particles with $\zeta<0.1\%$ . The fraction of particles with $\zeta<1\%$ has a logarithmic increase while the fraction of particles with $\zeta<0.1\%$ appears to increase faster than linearly with $p_{\text{T}}^{\text{jet}}$ . Both of these general trends are reproduced by Pythia 8, Sherpa, and Herwig++, although for example, Pythia 8 disagrees with the exact value at low $p_{\text{T}}^{\text{jet}}$ for $\zeta<1\%$ and all $p_{\text{T}}^{\text{jet}}$ for $\zeta<0.1\%$ . 
For the $\zeta<0.1\%$ case, Sherpa and Pythia 8 bracket the data, with Sherpa predicting more particles with a lower $\zeta$ fraction, while Herwig++ is much closer to the data. The average values of $\sqrt{\zeta}$ and $\zeta^{2}$ for individual particles as a function of $p_{\text{T}}^{\text{jet}}$ in the top panel of Figure 9 show a decreasing trend that is qualitatively similar to the trend for the average $\zeta$ in Figure 7. For $\sqrt{\zeta}$, Pythia 8/Herwig++ and Sherpa bracket the data, although Pythia 8 agrees with the data within the uncertainty. Sherpa predicts a significantly higher average $\zeta^{2}$ than is present in the data. As with $\langle n_{\text{ch}}\rangle$, the average value of $\sum_{i\in\text{jet}}\zeta^{1/2}$ increases with the jet $p_{\text{T}}$, while the $p_{\text{T}}$ dependence of $\langle\sum_{i\in\text{jet}}\zeta^{2}\rangle$ is more complicated as it first decreases and then slowly increases with jet $p_{\text{T}}$. This trend is well reproduced by Pythia and Herwig++, but not by Sherpa.
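The geometric $\zeta$ binning and the kinematic threshold mentioned in the preceding discussion can be reproduced with a short sketch. The bin-edge formula $1/1.5^{n}$ and the 500 MeV track threshold are from the text; the function names are hypothetical.

```python
import numpy as np

def zeta_bin_edges(n_max=21):
    """Bin edges 1/1.5^n for n = 0, ..., n_max, as quoted in the text."""
    n = np.arange(n_max + 1)
    return 1.0 / 1.5**n

def min_jet_pt_gev(zeta, track_pt_min_gev=0.5):
    """Smallest jet pT (GeV) at which a track above threshold can have
    momentum fraction zeta, using zeta = pT(track) / pT(jet)."""
    return track_pt_min_gev / zeta

edges = zeta_bin_edges()
# Compare the edges used in the text with the nominal 10%, 1% and 0.1% thresholds.
for target, n in [(0.1, 5), (0.01, 11), (0.001, 17)]:
    print(f"nominal {target:.3%}   edge 1/1.5^{n} = {edges[n]:.3%}")

print("jet pT needed for zeta = 0.1%:", min_jet_pt_gev(0.001), "GeV")
```

Printing the edges makes explicit how closely each chosen edge approximates its nominal threshold, and the last line reproduces the statement that only jets above about 500 GeV can contain particles with $\zeta<0.1\%$.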

To present more differential information, the full unfolded distributions for $n_{\text{ch}}$ , $\zeta$ , $p_{\text{T}}^{\text{rel}}$ , and $r$ are shown in Figures 10, 11, 12, and 13, respectively, for representative $p_{\text{T}}^{\text{jet}}$ bins. Many of the relevant trends are captured in the above discussion about the $p_{\text{T}}^{\text{jet}}$ dependence of the moments. However, finer information that may be useful for generator tuning is provided by the differential distributions.

8.2 Quark and gluon distributions

As discussed in Section 5.2, the more forward and the more central of the two selected jets can be separated to study differences between the radiation patterns within quark and gluon jets. Using the fraction of quark jets $f_{q}$ in the two jet samples (forward $f$ and central $c$ ), one can extract the quark ( $h_{i}^{q}$ ) and gluon ( $h_{i}^{g}$ ) jet fragmentation properties separately by solving a system of equations per bin $i$ of an observable:

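$$h_{i}^{f}=f_{q}^{f}\,h_{i}^{q}+(1-f_{q}^{f})\,h_{i}^{g},\qquad(4)$$

$$h_{i}^{c}=f_{q}^{c}\,h_{i}^{q}+(1-f_{q}^{c})\,h_{i}^{g},\qquad(5)$$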

where $f_{q}^{x}$ is the fraction of quark jets in sample $x$ (see Figure 3 for the gluon fraction) and the nominal fractions are taken from the default Pythia simulation described in Section 4. The flavor of a jet is defined as the type of the highest-energy parton from the event record (all partons prior to hadronization) matched to the jet via ghost association. This definition is not unique because quark and gluon labels are not universal due to color connections with other partons in the event.$^{5}$ In addition to the uncertainty in $h_{i}^{f}$ and $h_{i}^{c}$ from the unfolding method, uncertainties in the extracted $h_{i}^{g}$ and $h_{i}^{q}$ distributions arise from the PDF choice, from the matrix elements, from the fragmentation model (due to flavor changing), and from the method non-closure. The determination of the uncertainty from the choice of PDF uses the NNPDF uncertainty set (NNPDF 2.3 at LO in QCD and QED with $\alpha_{\text{S}}(m_{Z})=0.119$) and the matrix-element uncertainty is estimated by comparing the nominal fractions from Pythia with those from Herwig.$^{6}$ The non-closure uncertainty is due to the small (sub-percent level) differences between forward and central quark jets, as well as forward and central gluon jets, resulting from an $\eta$ dependence in the jet fragmentation at a fixed jet $p_{\text{T}}$ [102]. When presenting the average properties in bins of jet $p_{\text{T}}$, the binning correction described in Section 6 is also applied and the corresponding uncertainty contributes to the total uncertainty (though it is smaller than other sources of uncertainty).

$^{5}$ However, for isolated jets, the topology dependence is predicted to be much smaller than the difference between quark and gluon jets [102].

$^{6}$ These two generators also use different PDF sets, so this uncertainty is double-counted in the overall uncertainty.
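The extraction amounts to inverting a $2\times 2$ linear system in every bin. The sketch below assumes the two-component mixture form $h_{i}^{x}=f_{q}^{x}h_{i}^{q}+(1-f_{q}^{x})h_{i}^{g}$ for $x\in\{f,c\}$ and solves it in closed form; the function and argument names are hypothetical, and the fractions would in practice be $p_{\text{T}}$-dependent inputs from simulation.

```python
import numpy as np

def extract_quark_gluon(h_forward, h_central, fq_forward, fq_central):
    """Solve the per-bin two-sample mixture for the quark and gluon shapes (sketch).

    Assumed model (per bin i):
        h_f = fq_f * h_q + (1 - fq_f) * h_g
        h_c = fq_c * h_q + (1 - fq_c) * h_g
    """
    h_f = np.asarray(h_forward, dtype=float)
    h_c = np.asarray(h_central, dtype=float)
    det = fq_forward - fq_central
    if np.isclose(det, 0.0):
        # Identical quark fractions carry no information to separate the flavors.
        raise ValueError("quark fractions of the two samples must differ")
    h_q = ((1.0 - fq_central) * h_f - (1.0 - fq_forward) * h_c) / det
    h_g = (fq_forward * h_c - fq_central * h_f) / det
    return h_q, h_g
```

Because the solution divides by the difference of the two quark fractions, a small difference between the forward and central samples amplifies both statistical fluctuations and any mismodeling of the fractions.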

The matrix-element uncertainty dominates the total uncertainty in the extraction procedure, resulting in an uncertainty that is about 1% at high jet $p_{\text{T}}$ and about 5% at low to moderate jet $p_{\text{T}}$ for quark jets, with the inverse trend for gluon jets (low uncertainty at low jet $p_{\text{T}}$ and large uncertainty at high jet $p_{\text{T}}$ ). The extractions presented here use leading-order matrix elements and leading-logarithm parton shower programs; higher-order effects that modify the fractions $f$ are not included in this leading-order extraction. Figure 14 shows the extracted quark and gluon distributions for jets with $1000$ GeV $<p_{\text{T}}^{\text{jet}}<1200$ GeV. To reinforce the simulation dependence of these extractions, the data distributions are referred to as ‘extracted quark-like’ and ‘extracted gluon-like’.

A key challenge with the extraction based on Eqs. (4) and (5) is that it strongly depends on simulation for the fractions $f_{q}$ and $f_{g}$ . A new approach that does not require the input of any fractions is topic modeling [103, 104], which holds great promise for learning about quark- and gluon-like jets with less input from theory. In this approach, one can extract distributions of ‘topics’ $T_{1}$ and $T_{2}$ :

[TABLE]

In the limit that $\text{min}_{j}\{h^{g}_{j}/h_{j}^{q}\}=\text{min}_{j}\{h^{q}_{j}/h^{g}_{j}\}=0$ , $h^{T_{1}}=h^{q}$ and $h^{T_{2}}=h^{g}$ . When this is not exactly the case, the topics are universal but not pure combinations of quarks and gluons. The extracted topics using $n_{\text{ch}}$ in two jet $p_{\text{T}}$ bins are shown in Figure 15. The very low $n_{\text{ch}}$ region is dominated by quarks and the very high $n_{\text{ch}}$ region is dominated by gluons and therefore $n_{\text{ch}}$ nearly has the property that $\text{min}_{j}\{h^{g}_{j}/h_{j}^{q}\}\approx\text{min}_{j}\{h^{q}_{j}/h^{g}_{j}\}\approx 0$ . Therefore, the first topic is well aligned with quarks and the second topic is more gluon-like. This alignment is better for quarks than for gluons, but the second topic does converge to the gluon distribution at high jet $p_{\text{T}}$ . Other observables aside from $n_{\text{ch}}$ are not considered for topic modeling because there are no bins where $h^{g}_{j}/h_{j}^{q}=0$ or $h^{q}_{j}/h^{g}_{j}=0$ is approximately true and therefore the topics do not align with quark- and gluon-like quantities.
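For orientation, the demixing step used in the topic-modeling literature [103, 104] can be sketched as follows. This is an assumption-laden illustration, not the procedure used for Figure 15: it takes the reducibility factor as a plain minimum of the bin-by-bin ratio, ignores statistical fluctuations in sparsely populated bins, and assigns the forward sample to the first topic. All names are hypothetical.

```python
import numpy as np

def reducibility(h_a, h_b):
    """min_j h_a[j] / h_b[j] over bins where h_b is non-zero."""
    mask = h_b > 0
    return np.min(h_a[mask] / h_b[mask])

def extract_topics(h_forward, h_central):
    """Subtract from each sample the largest possible multiple of the other (sketch).

    Fails (division by zero) if the two input distributions are identical,
    since then there is nothing to demix.
    """
    h_f = np.asarray(h_forward, dtype=float)
    h_c = np.asarray(h_central, dtype=float)
    k_fc = reducibility(h_f, h_c)
    k_cf = reducibility(h_c, h_f)
    topic1 = (h_f - k_fc * h_c) / (1.0 - k_fc)
    topic2 = (h_c - k_cf * h_f) / (1.0 - k_cf)
    return topic1, topic2
```

When the underlying quark and gluon shapes each have a bin where the other vanishes, this procedure returns them exactly without using the quark fractions; otherwise the topics are mixtures, in line with the limit discussed above.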

While the full quark and gluon distributions presented in Figure 14 cannot be predicted from perturbative QCD, it is possible to model the $p_{\text{T}}^{\text{jet}}$ dependence of the moments of the $\zeta$ distribution. Positive moments of the fragmentation function have a perturbative evolution with a proper $\alpha_{\text{S}}$ power series via DGLAP-like equations. In general, there are two terms that contribute to the right-hand side of Eq. (1) that prevent an analytic solution: one term proportional to $D_{g}^{h}$ and one term proportional to $D_{q}^{h}$, where the coefficients for the $\kappa$ sums are the Mellin transforms $\tilde{P}_{p^{\prime}\leftarrow p}(\kappa)=\int_{0}^{1}d\zeta\zeta^{\kappa}P_{p^{\prime}\leftarrow p}(\zeta)$ for $p^{\prime}=g$ and $p^{\prime}=q$, respectively. For gluon jets ($p=g$), the $g\rightarrow qq^{\prime}$ splitting function is finite,$^{7}$ so $|\tilde{P}_{g\leftarrow g}|\gg|\tilde{P}_{q\leftarrow g}|$ for $\kappa\neq 0.8$ where $\tilde{P}_{g\leftarrow g}$ switches sign. Therefore, away from $\kappa\approx 0.8$ and in the modified leading-logarithm approximation (MLLA)$^{8}$ [73, 108, 109],

[TABLE]

where $\beta_{0}$ is the first term in the QCD $\beta$-function and $\Lambda$ is a non-perturbative parameter (of order $\Lambda_{\text{QCD}}$). The predictions are scaled to match the data in the sixth jet $p_{\text{T}}$ bin (referred to as the ‘anchor bin’). There is no a priori reason to select any particular bin as the anchor bin so one of the first bins after the lowest-threshold unprescaled jet trigger is selected. Figure 16 shows the distributions of the average $\sum_{i\in\text{jet}}\zeta_{i}^{\kappa}$ for $\kappa=0.5,1.0$, and $2.0$ for gluon jets. As mentioned above, $\tilde{P}_{g\leftarrow g}$ is predicted to change sign at $\kappa=0.8$, a trend which is supported by the data: for low $\kappa$, the average value increases with $p_{\text{T}}$ and when $\kappa$ is large, the average decreases with $p_{\text{T}}$. For $\kappa=1$, momentum conservation and isospin symmetry predict that the average value of $\sum_{i\in\text{jet}}\zeta_{i}$ should be constant and approximately $2/3$, the ratio of charged pions to all pions.$^{9}$ The leading-logarithm (LL) calculation predicts $\tilde{P}_{g\leftarrow g}(1)\approx 0$ so the $p_{\text{T}}$ dependence is already negligible compared with the $\kappa=0.5$ and $\kappa=2$ cases.

$^{7}$ The splitting function $\tilde{P}_{q\leftarrow q}$ is also finite, but is not numerically small compared with $\tilde{P}_{g\leftarrow q}$ except when $\kappa$ is very small so this case is not considered further.

$^{8}$ This means resummation that includes the leading-order splitting functions and the first-order running of the strong coupling. A more refined calculation [19, 18] using SCET [16, 15, 14, 13] and fragmenting jet functions [105, 106, 107] is possible. However, the deviations from this simple approach are higher-order corrections and do not qualitatively change the comparisons in this section.

$^{9}$ The measured value is not exactly $2/3$ because a jet’s energy is only about 60% due to pions.

When $\kappa\rightarrow 0$ , both the quark and gluon fragmentation-function Mellin transforms diverge and so the analysis with Eq. (6) is not accurate. The $\kappa\rightarrow 0$ limit is $\langle n_{\text{ch}}\rangle$ and there is no known series in $\alpha_{\text{S}}$ to describe its $p_{\text{T}}^{\text{jet}}$ dependence. Despite this, the anomalous dimension for the $p_{\text{T}}^{\text{jet}}$ dependence of $\langle n_{\text{ch}}\rangle$ has been calculated to ‘N3LO’ where the series is in $\sqrt{\alpha_{\text{S}}}$ instead of $\alpha_{\text{S}}$ [22, 23]. Figure 17 shows $\langle n_{\text{ch}}\rangle$ as a function of $p_{\text{T}}^{\text{jet}}$ for both extracted quark-like and gluon-like jets as well as the topic extraction along with the prediction for the pure quark/gluon case. Gluon jets from data deviate significantly from simulation and from the calculation at high jet $p_{\text{T}}$ ; this is also true to a lesser extent for quark jets, which seem to have a different slope that is most prominent at low jet $p_{\text{T}}$ . A similar trend was first observed in Ref. [7], albeit with lower precision in the highest $p_{\text{T}}$ bins. There are several possibilities for this discrepancy, such as an unaccounted for potential source of bias in the quark/gluon jet fractions. The data in the right panel of Figure 17 do not yet conclusively support or reject this hypothesis; with more data, it may be possible to determine if the data match topic 2 in Pythia or deviate as is the case for gluons in the left panel of Figure 17.

The $p_{\text{T}}$ dependence of the average $\zeta$, $p_{\text{T}}^{\text{rel}}$, and $r$ is shown in Figure 18. Gluon jets have more constituents than quark jets on average so their average $\zeta$ is lower. For both quark and gluon jets, $\langle\zeta\rangle$ decreases with the jet $p_{\text{T}}$ in part because constituent multiplicity increases with $p_{\text{T}}$. Gluon jets are wider than quark jets on average, but both quark and gluon jets become denser with increasing jet $p_{\text{T}}$. The data show nearly the same trends as Pythia in all cases.

9 Conclusion

This paper documents a measurement of track-based jet fragmentation functions in $pp$ collisions at $\sqrt{s}=13$ TeV. The analysis uses a dataset corresponding to an integrated luminosity of 33 fb$^{-1}$ recorded by the ATLAS detector at the LHC. Multiple jet properties, including the charged-particle multiplicity, the momentum fraction carried by charged particles, and angular properties of the radiation pattern inside jets are studied. There are key areas where there are significant disagreements between the ATLAS default MC simulation (Pythia 8.2 with the A14 tune, Herwig++, and Sherpa) and the data, especially for the radial profiles and momentum distributions in Sherpa. The radial profile is systematically broader in data than in simulation, but the momentum transverse to the jet axis and the momentum fraction are well modeled within the precision of this measurement. Near 1 TeV in jet $p_{\text{T}}$, these measurements have achieved percent-level uncertainties for a variety of observables. In addition to measuring the forward, central, and combined jet distributions, the forward and central jet spectra are considered separately to study quark- and gluon-like distributions. A first measurement of topic modeling for the charged-particle multiplicity provides a promising alternative to traditional methods of extracting quark- and gluon-jet distributions that use input from simulation. The simulations provide a reasonable description of the quark-like data across the jet $p_{\text{T}}$ range presented in this measurement, but the gluon-like data have systematically fewer charged particles than the simulations by about 10%.

The unfolded data are made public through HepData to provide input to help improve both perturbative and non-perturbative aspects of fragmentation modeling in the future.

Acknowledgments

We thank CERN for the very successful operation of the LHC, as well as the support staff from our institutions without whom ATLAS could not be operated efficiently.

We acknowledge the support of ANPCyT, Argentina; YerPhI, Armenia; ARC, Australia; BMWFW and FWF, Austria; ANAS, Azerbaijan; SSTC, Belarus; CNPq and FAPESP, Brazil; NSERC, NRC and CFI, Canada; CERN; CONICYT, Chile; CAS, MOST and NSFC, China; COLCIENCIAS, Colombia; MSMT CR, MPO CR and VSC CR, Czech Republic; DNRF and DNSRC, Denmark; IN2P3-CNRS, CEA-DRF/IRFU, France; SRNSFG, Georgia; BMBF, HGF, and MPG, Germany; GSRT, Greece; RGC, Hong Kong SAR, China; ISF and Benoziyo Center, Israel; INFN, Italy; MEXT and JSPS, Japan; CNRST, Morocco; NWO, Netherlands; RCN, Norway; MNiSW and NCN, Poland; FCT, Portugal; MNE/IFA, Romania; MES of Russia and NRC KI, Russian Federation; JINR; MESTD, Serbia; MSSR, Slovakia; ARRS and MIZŠ, Slovenia; DST/NRF, South Africa; MINECO, Spain; SRC and Wallenberg Foundation, Sweden; SERI, SNSF and Cantons of Bern and Geneva, Switzerland; MOST, Taiwan; TAEK, Turkey; STFC, United Kingdom; DOE and NSF, United States of America. In addition, individual groups and members have received support from BCKDF, CANARIE, CRC and Compute Canada, Canada; COST, ERC, ERDF, Horizon 2020, and Marie Skłodowska-Curie Actions, European Union; Investissements d’Avenir Labex and Idex, ANR, France; DFG and AvH Foundation, Germany; Herakleitos, Thales and Aristeia programmes co-financed by EU-ESF and the Greek NSRF, Greece; BSF-NSF and GIF, Israel; CERCA Programme Generalitat de Catalunya, Spain; The Royal Society and Leverhulme Trust, United Kingdom.

The crucial computing support from all WLCG partners is acknowledged gratefully, in particular from CERN, the ATLAS Tier-1 facilities at TRIUMF (Canada), NDGF (Denmark, Norway, Sweden), CC-IN2P3 (France), KIT/GridKA (Germany), INFN-CNAF (Italy), NL-T1 (Netherlands), PIC (Spain), ASGC (Taiwan), RAL (UK) and BNL (USA), the Tier-2 facilities worldwide and large non-WLCG resource providers. Major contributors of computing resources are listed in Ref. [110].

Bibliography

[1] Andy Buckley “General-purpose event generators for LHC physics” In Phys. Rept. 504, 2011, pp. 145–233 DOI: 10.1016/j.physrep.2011.03.005
[2] ATLAS Collaboration “Properties of jets measured from tracks in proton-proton collisions at center-of-mass energy $\sqrt{s}=7$ TeV with the ATLAS detector” In Phys. Rev. D 84, 2011, pp. 054001 DOI: 10.1103/PhysRevD.84.054001
[3] ATLAS Collaboration “Study of jet shapes in inclusive jet production in $pp$ collisions at $\sqrt{s}=7$ TeV using the ATLAS detector” In Phys. Rev. D 83, 2011, pp. 052003 DOI: 10.1103/PhysRevD.83.052003
[4] ATLAS Collaboration “Measurement of jet shapes in top-quark pair events at $\sqrt{s}=7$ TeV using the ATLAS detector” In Eur. Phys. J. C 73.12, 2013, pp. 2676 DOI: 10.1140/epjc/s10052-013-2676-3
[5] ATLAS Collaboration “Jet mass and substructure of inclusive jets in $\sqrt{s}=7$ TeV $pp$ collisions with the ATLAS experiment” In JHEP 05, 2012, pp. 128 DOI: 10.1007/JHEP05(2012)128
[6] ATLAS Collaboration “ATLAS Pythia 8 tunes to 7 TeV data”, ATL-PHYS-PUB-2014-021, 2014 URL: https://cds.cern.ch/record/1966419
[7] ATLAS Collaboration “Measurement of the charged-particle multiplicity inside jets from $\sqrt{s}=8$ TeV $pp$ collisions with the ATLAS detector” In Eur. Phys. J. C 76.6, 2016, pp. 322 DOI: 10.1140/epjc/s10052-016-4126-5
[8] Daniel Reichelt, Peter Richardson and Andrzej Siodmok “Improving the simulation of quark and gluon jets with Herwig 7” In Eur. Phys. J. C 77, 2017, pp. 876 DOI: 10.1140/epjc/s10052-017-5374-8