Minimising event size, maximising physics: inclusive particle isolation for LHCb’s Run 3

Marta Calvi; Tommaso Fulghesu; George Hallett; Luca Hartman; Basem Khanji; Veronica S. Kirsebom; Thomas Latham; Marion Lehuraux; Ching-Hua Li; Abhijit Mathad; Matthew Monk; Andy Morris; Matthew Scott Rudolph; Francesca Swystun; Dorothea vom Bruch

PMC · DOI:10.1140/epjc/s10052-026-15398-5·March 9, 2026

Open Access

TL;DR

This paper introduces a new algorithm to reduce data size in particle physics experiments without losing important physics information.

Contribution

The novel Inclusive Multivariate Isolation (IMI) algorithm improves data reduction while maintaining physics performance.

Findings

01

The IMI algorithm reduces data size by 45% while preserving full physics performance.

02

IMI achieves 99% efficiency in selecting signal particles across diverse decay topologies.

03

The algorithm is validated on real Run 3 data and shows robustness under actual conditions.

Abstract

Run 3 of the LHC brings unprecedented luminosity and a surge in data volume to the LHCb detector, necessitating a critical reduction in the size of each reconstructed event without compromising the physics reach of the heavy-flavour programme. While signal decays typically involve just a few charged particles, a single proton–proton collision produces hundreds of tracks, with charged particle information dominating the event size. To address this imbalance, a suite of inclusive isolation tools has been developed, including both classical methods and a novel Inclusive Multivariate Isolation (IMI) algorithm. The IMI unifies the key strengths of classical isolation techniques and is designed to robustly handle diverse decay topologies and kinematics, enabling efficient reconstruction of decay chains with varying final-state multiplicities. It consistently outperforms traditional…

Signal efficiency as a function of $q^2$ for different isolation methods: IMI with output $> 0.05$ (blue), *

The LHCb data flow during Run 3, illustrating the throughput and event rates for different streams, and indicating where the classical and IMI isolation algorithms are applied. Note that none of the Spruce selection lines using IMI rely on candidates preselected with the classical isolation tool. The quoted event rates include all isolation-based selection lines, and the bandwidths correspond to the maximum stream allocations from Ref. [5]

Isolation workflow based on the Inclusive Multivariate Isolation (IMI) approach

Fig. 13. (Left) Relative reduction in output file size (blue) and relative inference throughput (red) as a function of the IMI threshold. The throughput is evaluated in steps of 0.01 and, for display, the median is shown in bins of width 0.2; the shaded band indicates the standard deviation. Throughput is measured across all production LHCb sprucing lines relative to a baseline in which isolation is ignored and all extra particles are persisted. As expected, the throughput is only weakly dependent on the IMI threshold

Illustration of typical signal- and background-like topologies relevant to the IMI tool. The depicted $B_s^{*} \rightarrow B K$ decay is shown as an illustrative example and was used only in the development of the cut-based method. In the left diagram, the signal candidates include non-isolated particles originating from the decay vertex of a b-hadron

Distributions of (top) $\Delta R$ and (bottom) $\log\bigl(\chi^{2}_{\mathrm{IP\,w.r.t.\,SV}}\bigr)$ f

Distributions of the IMI classifier output for non-isolated signal (red) and isolated background (green) particles in the training sample (filled histograms) and evaluation sample (markers). The strong agreement confirms the absence of overtraining and the model’s ability to generalise to unseen data

Signal efficiency (left) and fraction of charged particles accepted per event (right) as a function of the IMI threshold, computed on simulated $B^{0} \rightarrow D^{*-} e^{+} \nu_{e}$ events and compared to partial Run 3 data

Signal *candidate* efficiency versus background *track* rejection for the Track (teal), Cone (orange), Vertex (light blue), and IMI (dark blue) isolation methods. The top panel shows the performance on the inclusive MC sample; the red marker indicates the chosen IMI working point. The four lower panels (clockwise from top-left) show Track, Cone, IMI, and Vertex isolation, each evaluated on the inclusive sample (solid) and on three exclusive benchmark channels with increasing numbers of non-isolated signal particles (see Table 1):

Background rejection at a fixed signal efficiency of approximately $99\%$ for the Track (teal), Cone (orange), Vertex (light blue), and IMI (dark blue) isolation methods as a function of the number of reconstructed charged particles in the event (event multiplicity). The top panel shows performance on the inclusive simulated sample (see Table 1

Schematic illustration of the three classical isolation strategies: track isolation (top), based on large impact-parameter significance; cone isolation (middle), based on the track’s proximity to the signal candidate; and vertex isolation (bottom), based on compatibility with the signal decay vertex

SHAP summary plot for the IMI classifier [38]. The y-axis lists input features in order of decreasing importance. The x-axis indicates each feature’s SHAP value, that is, its contribution to the classifier’s output, while the color encodes the raw feature value (green = low, red = high). Positive SHAP values push the classifier toward signal-like predictions, while negative values indicate background-like behaviour

Funding

UK Research and Innovation (http://dx.doi.org/10.13039/100014013)
H2020 European Research Council (http://dx.doi.org/10.13039/100010663)
Science and Technology Facilities Council (http://dx.doi.org/10.13039/501100000271)
nccr – on the move (http://dx.doi.org/10.13039/100012496)

Taxonomy

Topics: Particle Detector Development and Performance · Particle physics theoretical and experimental studies · Radiation Detection and Scintillator Technologies

Full text

Introduction

A central challenge for the upgraded LHCb detector [1] during Run 3 (2022–2026) is to minimise the amount of data written to permanent storage while preserving the full physics potential of the experiment. The Large Hadron Collider (LHC) now delivers an instantaneous luminosity of up to $\mathcal{L} = 2 \times 10^{33}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$ to the LHCb experiment, a five-fold increase compared to Run 2 (2015–2018). In addition, the upgraded LHCb detector is fully read out at the 30 MHz non-empty bunch-crossing rate and is processed by a software-only High-Level Trigger (HLT) [2–4].

At this input rate, the HLT reconstructs and selects$^{1}$ approximately $\mathcal{O}(250~\text{kHz})$ of physics events. These are subsequently written to tape at a total data rate of around $\mathcal{O}(10~\text{GB}/\text{s})$ [5]. These events are split across three data streams: Turbo, Turcal, and Full.

The Turbo stream, inspired by the model introduced in Run 2 [6, 7], stores only the online-reconstructed information relevant to the triggered signal, yielding a compact, analysis-ready format. It is used in two configurations: (i) minimal Turbo, which persists only the objects explicitly requested by the selection algorithms (often referred to as selection lines), typically the signal candidate and essential metadata, and (ii) Turbo with selective persistency (Turbo(SP)), which additionally saves a controlled set of associated objects (e.g. vertices) according to per-line rules. The latter trades a modest increase in event size for greater offline flexibility. In contrast, the Full stream retains the entire reconstructed event, enabling detailed offline analysis. For instance, in semileptonic b-hadron decays (final states containing one or more leptons), controlling backgrounds from partially reconstructed decays is essential. In such cases, the Full stream enables direct reconstruction of these backgrounds, providing strong constraints on their shapes, yields, and associated systematics. The Turcal stream further supplements the data by including raw detector information required for calibration tasks (e.g. particle identification). Because of their richer event content, Full and Turcal dominate storage bandwidth. As our work focuses on physics analyses, we specifically target events recorded in the Full and Turbo streams.

To further reduce data volume, the Sprucing framework [8] performs a second, centralised selection step that reduces the total output rate to approximately 3.5 GB/s written to disk. For the Full stream in particular, the Technical Design Report (TDR) [5] mandates a reduction from 5.9 GB/s at the output of HLT2 to just 0.8 GB/s after sprucing [8]. Achieving this near eight-fold compression across $\mathcal{O}(10^3)$ Sprucing selection lines requires aggressive yet efficient pruning of event content.

The total data rate can be expressed as the product of two factors:

$$\text{Data-rate}\,[\text{MB/s}] \propto \underbrace{\text{event rate}}_{\text{Hlt/Sprucing}}\,[\text{kHz}] \times \underbrace{\text{event size}}_{\text{Hlt/Sprucing}}\,[\text{kB}]. \tag{1}$$

Therefore, a reduction in the data rate can be achieved either by lowering the number of events retained (event rate) or by decreasing the amount of information stored per event (event size). Extensive efforts by the LHCb physics and performance working groups have addressed both strategies, developing high-efficiency reconstruction [9] and selection algorithms to retain not only the most “interesting” events, but also signal candidates well-suited for fast offline analysis. This work advances the second component in Eq. (1), aiming to reduce the full event size through isolation algorithms that do more than just isolate signal decays: they selectively retain only the most relevant event information for detailed offline analysis.

Fig. 1. (Top) Event size breakdown for semileptonic Full stream events in LHCb, based on a minimum-bias Run 3 simulation. The base particles and metadata include a few particles from a partially reconstructed b-hadron decay (e.g., $\mathrm{D}^0$ and a muon), together with primary vertex, trigger, and reconstruction metadata. The extra charged particle component covers all other charged tracks (Long, Downstream, Upstream) and RICH particle identification (PID) information. Additional VELO and T tracks contribute 10% and 15% of the event size, respectively, and are not shown as they do not contribute to the Semileptonic event size. The extra neutral particle component corresponds to reconstructed neutrals, such as photons and neutral hadrons. (Bottom) Track categories in LHCb, defined by the tracking sub-detectors where hits are recorded [1]
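The proportionality in Eq. (1) can be made concrete with a short numerical sketch. It assumes a proportionality constant of one and decimal units (1 kB = 1000 bytes), so that kHz times kB gives MB/s directly. The function names and the 100 kHz / 50 kB example values are hypothetical and chosen only for illustration; the 5.9 GB/s and 0.8 GB/s figures are the Full stream rates quoted above.

```python
def data_rate_mb_per_s(event_rate_khz: float, event_size_kb: float) -> float:
    """Data rate in MB/s from an event rate in kHz and an event size in kB.

    kHz * kB = (1e3 events/s) * (1e3 bytes/event) = 1e6 bytes/s = 1 MB/s,
    assuming a proportionality constant of one in Eq. (1).
    """
    return event_rate_khz * event_size_kb


def reduction_factor(rate_before: float, rate_after: float) -> float:
    """Factor by which a data rate is reduced (same units for both inputs)."""
    return rate_before / rate_after


# Hypothetical stream: 100 kHz of events at 50 kB each.
baseline = data_rate_mb_per_s(100.0, 50.0)

# At fixed event rate, shrinking the event size by 40% shrinks the data rate
# by the same fraction; this is the lever that isolation targets.
pruned = data_rate_mb_per_s(100.0, 50.0 * 0.6)

# Full stream figures quoted in the text: 5.9 GB/s before, 0.8 GB/s after sprucing.
full_stream_factor = reduction_factor(5.9, 0.8)

print(baseline, pruned, round(full_stream_factor, 2))
```

Because the two factors enter multiplicatively, a given fractional cut in event size is worth exactly as much bandwidth as the same fractional cut in event rate, but it does not discard any events.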

A typical event at LHCb contains reconstructed charged and neutral particles, along with associated information such as primary vertices, trigger decisions, and a reconstruction summary. Figure 1 (top) shows the size composition of a representative Full stream event used in semileptonic (SL) analyses. Only about 10% of the event size comes from metadata and the few particles of a partially reconstructed b-hadron decay. Neutral particles, mainly photons and neutral hadrons measured in the electromagnetic calorimeters, contribute roughly 35%. The largest share, nearly 55%, originates from reconstructed charged particles. As illustrated in Fig. 1 (bottom), these charged particles are classified into track categories according to the sub-detectors in which they leave hits. Long tracks are reconstructed with hits in the vertex locator (VELO), a high-precision silicon detector surrounding the interaction region, and in the downstream tracking stations, providing the best momentum resolution and vertex association. Upstream tracks have hits in the VELO and upstream tracking stations only and typically correspond to low-momentum particles that do not traverse the full spectrometer. Downstream and T tracks are reconstructed without VELO information and are characteristic of decays of long-lived particles such as $K_S^0$ or $\Lambda$, with T tracks using only downstream station hits. VELO tracks use hits only in the VELO and are essential for tracking and vertexing close to the interaction point, in particular for reconstructing primary vertices. Together with particle identification from the Ring Imaging Cherenkov detectors, the tracking information dominates the event size. For more detailed information on the track types and their reconstruction, see Ref. [1]. This consideration leads us to the central question:

*How can the relevant components of an event be identified efficiently, in a fast and inclusive manner, without introducing sensitivity to specific decay topologies or kinematic properties?*

The core difficulty arises from pileup: multiple pp interactions per bunch crossing produce several primary vertices (PVs) and secondary vertices (SVs), and the challenge is to associate reconstructed objects with the PV/SV that produced the signal hadrons. Inside the VELO, precise impact-parameter and vertexing information enables a largely geometric association to the correct vertex. Outside the VELO, however (after the magnet and into the downstream tracking stations, or when using RICH, calorimeter, or muon information), the association becomes far more difficult: longitudinal resolution degrades, magnetic deflection complicates back-extrapolation, and neutral objects lack track parameters altogether. Any strategy that reduces event size while preserving physics sensitivity must therefore identify and retain only the subset of objects compatible with the signal PV/SV, while rejecting contributions from other vertices and pileup activity.

This paper introduces an Inclusive Multivariate Isolation (IMI) algorithm specifically designed to address this association problem. Unlike classical isolation methods based on cones or vertexing, IMI evaluates and scores additional charged particles in the event, selecting only those most likely to originate from the same PV and decay chain as the signal candidate, while discarding the rest. As an illustrative example, consider the decay

[eqn]

In this example, the $[eqn]$ and $[eqn]$ constitute the base signal particles: they are selected with loose track-quality, PID, and vertex requirements to provide a minimal representation of the $b$-hadron decay vertex. The $[eqn]$ is an additional non-isolated particle that belongs to the same decay chain and that IMI is designed to retain with high background rejection, thereby enabling the reconstruction of the $D^{*-}$ state. By carefully selecting only the most relevant extra particles, IMI can significantly reduce the per-event size. Its design is therefore guided by the following key objectives:

1. Deliver better background rejection than classical isolation techniques, particularly in high-pileup environments where traditional algorithms tend to degrade.
2. Enable applications beyond standard combinatorial suppression, including:
   - Reconstruction of excited charm or charmless states involving particles from secondary decays, e.g. $B^0 \rightarrow D^{*-} \mu^+ \nu_\mu$, essential for precision tests of lepton-flavour universality [10, 11];
   - Selection of data-driven control samples, such as the dominant background mode $\Lambda_b \rightarrow \Lambda_c^+ \mu^- \nu_\mu$, to isolate suppressed signal decays like $\Lambda_b \rightarrow p \mu^- \nu_\mu$, critical for probing CKM unitarity [12];
   - Reconstruction of excited beauty-hadron states involving particles consistent with originating from the PV associated with the beauty hadron of interest, e.g., kaons in $B^*_{s2}(5840) \rightarrow B^+ K^-$, relevant for studies of lepton-flavour violation (LFV) [11] and relative branching-fraction measurements [13].
3. Minimise its impact on memory usage and on the overall event processing time (throughput), while remaining robust to variations in decay topology and kinematics.

As part of this work, classical isolation techniques were also developed for Run 3, including cone-based and vertex-based isolation, which remain widely used in many physics selections. Figure 2 shows the Run 3 data flow, including throughput and event rates for the different streams, and indicates where the classical isolation and IMI algorithms operate: the classical isolation runs in HLT2, whereas IMI is applied conservatively at the Sprucing stage. The two approaches are independent, i.e., selection lines using IMI do not rely on candidates preselected with the classical tool. As of the 2025 data-taking period, about $30\%$ of selection lines across the Turbo and Full streams use the classical isolation developed here, while IMI is employed in roughly $20\%$ of Sprucing selection lines targeting the Full stream.

Fig. 2. The LHCb data flow during Run 3, illustrating the throughput and event rates for different streams, and indicating where the classical and IMI isolation algorithms are applied. Note that none of the Spruce selection lines using IMI rely on candidates preselected with the classical isolation tool. The quoted event rates include all isolation-based selection lines, and the bandwidths correspond to the maximum stream allocations from Ref. [5]

The paper is organised as follows. Section 2 reviews classical isolation techniques. Section 3 describes the IMI algorithm, compares its performance with classical methods, and studies its impact on key offline kinematic observables. Section 4 details the implementation of the classical isolation variants and the IMI algorithm, and quantifies their effect on data reduction. Section 5 presents validation of the IMI algorithm using Run 3 data. Conclusions and outlook are given in Sect. 6.

Classical isolation algorithms

The forward geometry of the LHCb detector, combined with its focus on low transverse momentum physics, results in approximately $\mathcal{O}(200)$ reconstructible charged tracks per inelastic pp interaction at Run 3 instantaneous luminosity. Isolating the few tracks, typically $\mathcal{O}(2\text{–}7)$, that originate from a heavy flavour signal decay in this dense environment is therefore essential, both for physics performance and for reducing data volume.

Over the past decade, three complementary charged particle isolation strategies have been developed and routinely employed at LHCb, as well as at the general-purpose LHC experiments and in other high energy physics experiments.$^{2}$ These techniques are traditionally used to isolate signal candidates, defined here as reconstructed candidates for the decay of interest, by suppressing nearby activity. They can also be used in the complementary mode of selecting and retaining nearby, decay-related (non-isolated) particles, and this work focuses on the latter. Accordingly, we refer to isolated background particles as those unlikely to be associated with the signal decay, and to non-isolated signal particles as those likely to be associated with it (see Fig. 3 for clarity). The three classical isolation strategies considered here are:

Track isolation: Particles clearly associated with a primary vertex (PV) are rejected by requiring a large impact-parameter significance with respect to all the reconstructed PVs in the event, $\chi^{2}_{\mathrm{IP\,wrt\,PV}}$, or an association with the same PV as the signal candidate (samePV flag). The impact parameter (IP) is defined as the minimum distance between the particle’s trajectory and the given PV, while

$$\chi^{2}_{\mathrm{IP\,wrt\,PV}}$$

quantifies the change in the vertex-fit $\chi^{2}$ when the track is included in the PV fit. It is also common to impose a second requirement: that the particle has a small impact-parameter significance with respect to the secondary vertex (SV) of the signal candidate, $\mathrm{IP}\,\chi^{2}_{\mathrm{SV}}$. Simple cut-based implementations of this approach were first developed by the DØ [14] and CDF [15] experiments, followed by ATLAS [16] and CMS [17]. The method was later adopted at LHCb for rare decay searches [18, 19] and for flavour-tagging algorithms [20]. Multivariate extensions, incorporating additional topological and kinematic features, as well as track-type information (Long and VELO) [21], were first introduced in $B^0_{(s)} \rightarrow \mu^+ \mu^-$ analyses [22] and LFV studies [23], and they are also used in rare decay searches at ATLAS and CMS [24, 25]. A schematic representation of how the track isolation strategy works is shown in Fig. 3 (top).
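As a minimal sketch of the cut-based track isolation described above, the snippet below computes an impact parameter for a straight-line trajectory and applies the two requirements from the text. The straight-line approximation, the function names, and both threshold values are assumptions made for illustration; they are not taken from the LHCb software or from this paper.

```python
import math


def impact_parameter(point, direction, vertex):
    """Minimum distance between a straight-line trajectory and a vertex.

    The trajectory passes through `point` along `direction`; all arguments
    are (x, y, z) tuples. A straight line is an assumption of this sketch.
    """
    r = [v - p for v, p in zip(vertex, point)]
    # |r x d| / |d| is the perpendicular distance from the vertex to the line.
    cx = r[1] * direction[2] - r[2] * direction[1]
    cy = r[2] * direction[0] - r[0] * direction[2]
    cz = r[0] * direction[1] - r[1] * direction[0]
    norm_d = math.sqrt(sum(d * d for d in direction))
    return math.sqrt(cx * cx + cy * cy + cz * cz) / norm_d


def passes_track_isolation(chi2_ip_pvs, same_pv, chi2_ip_sv,
                           min_chi2_ip_pv=9.0, max_chi2_ip_sv=16.0):
    """Keep a track as a non-isolated signal particle (hypothetical thresholds).

    chi2_ip_pvs: IP chi2 of the track with respect to each reconstructed PV.
    same_pv:     True if the track is associated with the signal candidate's PV.
    chi2_ip_sv:  IP chi2 of the track with respect to the signal SV.
    """
    # "Large IP significance with respect to all PVs": the smallest value must pass.
    displaced_from_all_pvs = min(chi2_ip_pvs) > min_chi2_ip_pv
    # Second requirement: small IP significance with respect to the signal SV.
    close_to_sv = chi2_ip_sv < max_chi2_ip_sv
    return (displaced_from_all_pvs or same_pv) and close_to_sv


# A track parallel to the z axis, offset by 1 unit in x from a vertex at the origin.
print(impact_parameter((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0)))
print(passes_track_isolation([50.0, 120.0], same_pv=False, chi2_ip_sv=4.0))
```

Taking the minimum over all PVs is one way to encode "with respect to all the reconstructed PVs": a track that is compatible with any single PV fails the displacement requirement unless the samePV flag rescues it.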

Cone isolation: All reconstructed particles within a cone of radius

$$R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}, \qquad R \in [0.4, 0.5],$$

around the momentum direction of the signal candidate are considered. Here, $\Delta\eta$ and $\Delta\phi$ are the differences in pseudorapidity and azimuthal angle, respectively, between the signal candidate and other particles produced in the pp collision, defined in the laboratory frame (z along the beam direction, y vertically upward, and x completing a right-handed coordinate system). Typical discriminants include the particle multiplicity inside the cone, the transverse momentum of the leading track, $p_{\mathrm{T}}^{\text{lead}}$, and the $p_{\mathrm{T}}$ asymmetry,

$$A_{p_{\mathrm{T}}} = \frac{p_{\mathrm{T}}(\text{sig}) - \left|\sum_{i \in \text{cone}} \vec{p}^{\,i}\right|_{\mathrm{T}}}{p_{\mathrm{T}}(\text{sig}) + \left|\sum_{i \in \text{cone}} \vec{p}^{\,i}\right|_{\mathrm{T}}},$$

where the sum runs over the three-momenta $\vec{p}^{\,i}$ of all particles in the cone, and the transverse component is taken only after the sum is formed. The variable approaches $+1$ for a perfectly isolated signal candidate. Cone-based variables are fast and robust, and have seen widespread adoption in electroweak, jet, and heavy-flavour analyses [26–29], as well as in $\tau$ identification at ATLAS and CMS [30, 31]. However, their discriminating power tends to degrade in high-occupancy environments. Figure 3 (middle) provides a visual illustration of cone isolation.
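The cone variables defined above can be sketched in a few lines. The snippet computes the angular distance to each extra particle and the $p_{\mathrm{T}}$ asymmetry with the vector sum formed before the transverse component is taken, as in the formula. Function names, the default radius of 0.5 (the upper end of the quoted range), and the example momenta are assumptions for illustration; every particle is assumed to have non-zero transverse momentum.

```python
import math


def eta_phi(p):
    """Pseudorapidity and azimuthal angle of a momentum vector (px, py, pz)."""
    px, py, pz = p
    pt = math.hypot(px, py)  # assumed > 0, i.e. not exactly along the beam axis
    return math.asinh(pz / pt), math.atan2(py, px)


def delta_r(p1, p2):
    """Angular distance sqrt(d_eta^2 + d_phi^2) between two momenta."""
    eta1, phi1 = eta_phi(p1)
    eta2, phi2 = eta_phi(p2)
    # Wrap the azimuthal difference into [-pi, pi).
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def pt_asymmetry(signal_p, other_ps, radius=0.5):
    """pT asymmetry between the signal candidate and the particles in its cone."""
    in_cone = [p for p in other_ps if delta_r(signal_p, p) < radius]
    # Sum the momentum vectors first, then take the transverse component.
    sum_px = sum(p[0] for p in in_cone)
    sum_py = sum(p[1] for p in in_cone)
    pt_cone = math.hypot(sum_px, sum_py)
    pt_sig = math.hypot(signal_p[0], signal_p[1])
    return (pt_sig - pt_cone) / (pt_sig + pt_cone)


signal = (3.0, 4.0, 50.0)        # pT = 5 in arbitrary units
nearby = (0.6, 0.8, 10.0)        # same direction as the signal, pT = 1
far_away = (-1.0, 0.0, 1.0)      # well outside the cone

print(pt_asymmetry(signal, [nearby, far_away]))  # close to (5 - 1) / (5 + 1)
print(pt_asymmetry(signal, []))                  # empty cone gives +1
```

Summing vectors before projecting means that cone particles with opposing transverse directions partially cancel, so the asymmetry is not equivalent to one built from a scalar sum of individual $p_{\mathrm{T}}$ values.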

Vertex isolation: Reconstructed tracks in the event that are not part of the signal candidate, but pass loose track-quality requirements, are tested for compatibility with the signal decay by fitting them together with the signal candidate to a common secondary vertex. If the combined fit yields a $\chi^{2}/{\textrm{ndf}}$ below a chosen threshold and the vertex is significantly displaced from the PV, the additional track is considered consistent with the decay chain. In fully reconstructed decays, a simple cut-based vertex isolation is often sufficient: any extra track compatible with the decay vertex typically indicates a partially reconstructed background, whereas genuine signal candidates are expected to have no such additional tracks. For partially reconstructed decays, however, missing final-state particles make this logic less direct, and multivariate strategies have therefore become standard [32, 33]. Finally, we note that the change in vertex-fit quality upon adding the extra track, $\Delta \chi^{2}_{\mathrm{SV\,fit}}$, is highly correlated with the track impact-parameter significance with respect to the signal vertex, $\chi^{2}_{\mathrm{IP\,wrt\,SV}}$, as both quantify the compatibility of the track with the signal decay vertex; the two variables are thus often used interchangeably across analyses. A schematic illustration of the vertex-isolation strategy is shown in Fig. 3 (bottom).

Fig. 3. Schematic illustration of the three classical isolation strategies: track isolation (top), based on large impact-parameter significance; cone isolation (middle), based on the track’s proximity to the signal candidate; and vertex isolation (bottom), based on compatibility with the signal decay vertex.

Most LHCb analyses use classical techniques as standalone isolation tools, though in some cases two are combined: fast track-based isolation is typically deployed in the trigger, while the more computationally demanding vertex-based methods are applied offline, where their complementary strengths can be exploited. Each of the classical methods provides good signal efficiency, but none achieves the required background rejection in the denser Run 3 environment, where the average number of reconstructible tracks per event has tripled compared to Run 2 [1].

Inclusive multivariate isolation (IMI)

Due to the limitations of classical isolation algorithms under high-pileup conditions, a multivariate isolation tool was developed that integrates key features from all three traditional approaches into a single classifier. As illustrated in Eq. (2), this classifier assigns a score to each combination constructed from a few base particles and an extra particle, to determine whether the extra particle should be classified as isolated or non-isolated. Non-isolated particles (those close to base particles) are more likely to originate from the signal decay and are retained for further analysis, particularly for reconstructing excited charm or charmless states, or for defining background-enriched control samples. Conversely, the IMI score can also be used as a discriminating variable to suppress backgrounds, such as combinatorial and partially reconstructed decays.

To ensure compatibility with the computational constraints of the Sprucing framework, several lightweight machine learning (ML) algorithms were explored, including Multi-Layer Perceptrons (MLPs), Random Forests, and Gradient Boosted Decision Trees (GBDTs). The Extreme Gradient Boosting (XGBoost) algorithm [34] was selected for its balance of classification performance, computational efficiency and interpretability, and for its strong performance on tabular high-energy physics data.

In the following sub-sections, we describe the data samples used for the training and validation of the IMI model, the selection of input features, and the performance of the IMI tool compared to classical isolation methods.

Data samples

To ensure maximal inclusivity, the training sample consists of a broad class of simulated semileptonic beauty-hadron decays, generated under nominal Run 3 LHCb conditions with an instantaneous luminosity of $2\times 10^{33}\,\text{cm}^{-2}\text{s}^{-1}$, corresponding to an average pile-up of $\mu = 5.3$ (visible interactions) or $\nu = 7.6$ total proton–proton interactions per bunch crossing. The sample spans a variety of initial states ($B^0$, $B^+$, $B^0_s$, and $\Lambda^0_b$) and final states containing electrons ($e^\pm$), muons ($\mu^\pm$), or taus ($\tau^\pm$), covering a wide kinematic phase space to avoid biasing the algorithm toward specific kinematic properties of the signal decay. Throughout this paper, references to a decay mode implicitly include its charge-conjugate process. These decay chains also include both short-lived intermediate states (e.g., $D^{*}_{(s)}$, $D^{**}$, and $\Lambda_c^{*}$) and long-lived particles (e.g., $D^0$, $\tau$), enabling the isolation tool to learn how to disentangle contributions from both secondary and tertiary vertices. This diversity is essential to ensure that the IMI algorithm generalises robustly to the complex topologies expected in real LHCb data. A summary of the simulated decay modes used to train IMI, together with the corresponding base and non-isolated particles, is given in Table 1 in Appendix A.

In all simulated decays, the base particles are required to be reconstructed as Long tracks and are selected with loose vertex, track quality and PID criteria to ensure they originate from a common decay vertex and are consistent with signal decays. The IMI tool currently targets non-isolated Long and Upstream tracks. Downstream tracks, which make up a relatively small fraction of all reconstructed tracks, were excluded due to their poorer momentum and vertex resolution, which limit their contribution to isolation performance. VELO-only tracks have no momentum measurement and are therefore not typically used as analysis objects, so they were not included in the baseline IMI. Their main value is as an extra handle for very rare decays by capturing additional charged activity near the b-hadron decay region in the VELO; since VELO tracks are already persisted, they can be added to the IMI training in a future iteration to further improve background rejection if bandwidth becomes a limiting factor.

Signal and background classes

The goal of the IMI tool is to identify charged particles that genuinely originate from the decay of a heavy-flavour hadron, while rejecting the far more numerous background particles unrelated to the signal decay. Signal particles are defined as those produced in the decay of a beauty hadron, which typically travels about $1\,{\textrm{cm}}$ from the PV before decaying within the LHCb acceptance. These signal particles fall into two categories:

(a) those produced at the displaced secondary and tertiary vertices of the b-hadron decay, and
(b) those originating from prompt strong decays of excited beauty states, such as $B_{s2}(5840) \rightarrow B^+ K^-$, where the kaon is emitted at the PV, while the $B^+$ continues to travel and decays further downstream.

The background consists of all other charged particles in the event that are kinematically and topologically uncorrelated with the signal decay. These include:

(c) particles produced at or near the PV, predominantly from the soft-QCD hadronisation of light quarks and gluons in the underlying event, and
(d) particles originating from the decay of the second b-hadron in the event, which often form the dominant combinatorial background because they share a similar displaced-vertex signature but are uncorrelated with the signal vertex.

An illustration of these definitions of signal and background particles is shown in Fig. 4.

An architecture was briefly explored categorising outputs into four classes, designed to disentangle all the above categories using inclusive $b\bar{b}$ simulation samples [35]. In practice, the multiclass network delivered no measurable gain: the limited statistics available for categories (b) and (d) prevented the model from learning boundaries more precise than those already captured in a simpler binary formulation. Therefore, a two-class IMI classifier was adopted, optimised to separate long-lived b-hadron decays from background. Prompt decays of excited b-hadrons are instead treated with a dedicated cut-based selection, described in Appendix B. Future versions, trained on larger and more diverse simulated samples, may revisit the multiclass strategy to exploit finer distinctions among these categories.

Fig. 4. Illustration of typical signal- and background-like topologies relevant to the IMI tool. The depicted $B_s^{*} \rightarrow B K$ decay is shown as an illustrative example and was used only in the development of the cut-based method. In the left diagram, the signal candidates include non-isolated particles originating from the decay vertex of a b-hadron or its higher excited state. In the right diagram, the background candidates are defined as those formed by pairing with particles originating from the primary vertex or other b-hadrons in the event.

Fig. 5. Distributions of the input features used to train the IMI. The curves for signal particles (non-isolated particles) are shown as red histograms, while background particles (isolated particles) are represented by green histograms. These variables serve as inputs to the multivariate classifier. See Sect. 3.3 for a detailed description of each feature.

Input features

The IMI tool was trained on a feature set motivated by classical isolation algorithms and then iteratively refined to maximise signal–background separation while reducing redundancy and limiting correlations with key physics observables (e.g. $q^2$). Variables providing only marginal gains in high-occupancy (Run 3-like) conditions were removed, while robust features that generalise across exclusive channels were retained. For readability, we use standard abbreviations throughout: PV (primary vertex), SV (secondary vertex), IP (impact parameter), DOCA (distance of closest approach), POCA (point of closest approach), and DIRA (direction angle between momentum and flight direction).

IP significance wrt PV ($\chi^2_{\mathrm{IP\,wrt.\,PV}}$): IP $\chi^2$ of the additional particle with respect to the PV; it separates PV-produced (prompt) tracks from displaced tracks originating in long-lived b-hadron decays.
IP significance wrt SV ($\log(\chi^2_{\mathrm{IP\,wrt.\,SV}})$): As above, but computed with respect to the SV of the base candidate. Signal-like particles are compatible with the SV (small values), whereas unrelated tracks (including from the other b hadron) tend to be less compatible; tertiary decays (e.g. charm, $\tau$) populate intermediate values.
Cone separation (Transformed $\Delta R$): $\Delta R = \sqrt{(\Delta \eta)^2 + (\Delta \phi)^2}$ between the additional and base particles in the laboratory frame (see Sect. 2 for definition). Signal-like particles are typically close in angle; to enhance discrimination at small angles we use $(\Delta R)^{0.2}$.
Flight-direction alignment (Transformed DIRA): With $\hat{p}$ the momentum direction of the refitted candidate (base+track) and $\hat{FD}$ the PV $\rightarrow$ SV flight direction, the direction angle (DIRA) (see Fig. 3.4 of Ref. [36]) is defined as

$$\cos \alpha = \hat{p}\cdot \hat{FD}.$$

True b-hadron candidates are well aligned ($\cos \alpha \rightarrow 1$); we use $(1-\cos \alpha)^{0.2}$ to emphasise small misalignments.

SV shift (SV displacement): To quantify how compatible the additional track is with the base candidate’s decay vertex, we refit the base candidate together with the additional track and compare the fitted SV positions. We define

$$\begin{aligned} \Delta \vec{r}_{\textrm{SV}} &\equiv \vec{r}_{\textrm{SV}}^{\,\text{base+trk}} - \vec{r}_{\textrm{SV}}^{\,\text{base}},\\ d_{\textrm{SV}}^{\textrm{signed}} &\equiv \operatorname{sign}(\Delta r_{{\textrm{SV}},z})\,|\Delta \vec{r}_{\textrm{SV}}|, \end{aligned}$$

where $\vec{r}_{\textrm{SV}}^{\,\text{base}}$ is the decay vertex of the base candidate and $\vec{r}_{\textrm{SV}}^{\,\text{base+trk}}$ is the decay vertex after adding the extra track; $\Delta r_{{\textrm{SV}},z}$ is the z-component of the displacement. For correctly associated tracks (signal-like), the refit leaves the SV essentially unchanged, so $d_{\textrm{SV}}^{\textrm{signed}}$ peaks sharply at 0, with a width dominated by the SV-fit resolution. For misassociated tracks (typically prompt PV tracks), the common-vertex fit must compromise between displaced decay tracks and a prompt line pointing back to a PV; this frequently pulls the refitted SV upstream in the z-direction (along the beam axis), yielding a broader distribution with an enhanced negative tail ($d_{\textrm{SV}}^{\textrm{signed}}<0$), although positive shifts are possible depending on the geometry and track covariances.
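
The signed-displacement definition above can be sketched in a few lines of Python. The helper name, the convention $\operatorname{sign}(0)=0$ and the toy vertex positions (in mm) are assumptions made for this illustration; in practice the two vertices come from the vertex fits and are not computed here. The same helper would also apply to other signed vertex-to-vertex distances that take their sign from the z-component.

```python
import math

def signed_displacement(r_to, r_from):
    """|r_to - r_from| carrying the sign of the z-component of the difference."""
    dx, dy, dz = (a - b for a, b in zip(r_to, r_from))
    magnitude = math.sqrt(dx * dx + dy * dy + dz * dz)
    sign = (dz > 0) - (dz < 0)  # assumed convention: sign(0) = 0
    return sign * magnitude

# Toy vertex positions (x, y, z) in mm, invented for the example.
sv_base = (0.2, 0.1, 12.0)
sv_refit_signal = (0.2, 0.1, 12.0)   # refit leaves the SV unchanged
sv_refit_prompt = (0.2, 0.1, 9.0)    # refit pulled upstream by a prompt track

d_signal = signed_displacement(sv_refit_signal, sv_base)
d_prompt = signed_displacement(sv_refit_prompt, sv_base)
```

In this toy case the signal-like refit gives zero shift, while the prompt-track refit gives a negative value, which is the behaviour that populates the negative tail described above.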

DOCA significance ($\log \chi^2_{\textrm{DOCA}}$): The DOCA (distance of closest approach) between two reconstructed objects is the minimum spatial separation of their trajectories. We use the significance of this separation, expressed as

$$\log \chi^2_{\textrm{DOCA}},$$

where $\chi^2_{\textrm{DOCA}}$ can be interpreted as the squared separation at the POCA (point of closest approach), weighted by the associated covariance matrices; equivalently, it corresponds to the increase in vertex-fit $\chi^2$ when constraining the two objects to originate from a common vertex. When the base object is a track, both objects are approximated locally as straight lines using their fitted positions and directions in the vicinity of the POCA. When the base object is composite (i.e. decays to two or more particles), it is represented by its flight line from the composite decay vertex along its reconstructed momentum. This variable provides good discrimination: tracks produced at (or very near) the SV yield small values, tracks from displaced intermediate decays (e.g. charm or $\tau$) tend to populate intermediate values, and unrelated combinations (typically prompt PV activity) give the largest values.
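
For the straight-line approximation described above, the purely geometric DOCA can be sketched as follows. This is only the unweighted distance between two lines: the covariance weighting that turns it into $\chi^2_{\textrm{DOCA}}$ is not reproduced, and the function names, argument layout and parallel-line tolerance are assumptions made for this example.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def doca(point_a, dir_a, point_b, dir_b, eps=1e-12):
    """Minimum distance between two straight lines, each given by a point and a direction."""
    w = _sub(point_b, point_a)
    n = _cross(dir_a, dir_b)          # common normal of the two lines
    n_len = _norm(n)
    if n_len < eps * _norm(dir_a) * _norm(dir_b):
        # (Nearly) parallel lines: distance from point_b to line a.
        return _norm(_cross(w, dir_a)) / _norm(dir_a)
    # Skew or crossing lines: projection of the offset onto the common normal.
    return abs(_dot(w, n)) / n_len

# Two perpendicular lines separated by one unit along x.
d_skew = doca((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
# Two parallel lines separated by two units along y.
d_parallel = doca((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 2.0, 5.0), (0.0, 0.0, 2.0))
```

For a composite base object, `point_a` and `dir_a` would play the role of the composite decay vertex and its reconstructed momentum direction, following the flight-line representation described in the text.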

Transverse momentum ($\log(p_{\textrm{T}})$): $\log(p_{\textrm{T}})$ of the additional particle. It helps reject soft-QCD PV activity; despite some kinematic dependence, its discriminating power motivates inclusion.
Signed SV–PV flight distance: For the refitted candidate (base+track), we define the signed displacement between its SV and the associated best PV as

$$\begin{aligned} \Delta \vec{r}_{\textrm{PV}} &\equiv \vec{r}_{\textrm{SV}}^{\,\text{base+trk}} - \vec{r}_{\textrm{PV}},\\ d_{\textrm{PV}}^{\textrm{signed}} &\equiv \operatorname{sign}(\Delta r_{{\textrm{PV}},z})\,|\Delta \vec{r}_{\textrm{PV}}|, \end{aligned}$$

where $\vec{r}_{\textrm{PV}}$ is the position of the best PV (chosen as the PV that minimises the candidate’s IP with respect to the refitted candidate momentum), and $\Delta r_{{\textrm{PV}},z}$ is the z-component of the SV–PV displacement. For genuine long-lived decays, the SV lies downstream of the PV, so $d_{\textrm{PV}}^{\textrm{signed}}$ is predominantly positive, with a tail governed by the b-hadron boost. For background combinations formed by attaching a prompt PV track to a genuine displaced base candidate, the refit tends to pull the SV toward the prompt track’s PV. Two typical behaviours follow: (i) if the refitted SV shifts upstream while the best-PV association remains downstream, then $z_{\textrm{SV}}^{\text{base+trk}}<z_{\textrm{PV}}^{\text{best}}$ and $d_{\textrm{PV}}^{\textrm{signed}}$ becomes strongly negative, generating the pronounced negative tail; (ii) if the best PV flips to the prompt track’s PV, the SV lies close to that PV and $d_{\textrm{PV}}^{\textrm{signed}}$ populates the near-zero (or small positive) region. These effects are amplified in multi-PV environments, where PV–PV separations along the z-direction can be several millimetres.

Momentum opening angle (Transformed $\cos \theta$): Cosine of the opening angle between the base and additional-particle momenta. Signal particles are more aligned (large $\cos \theta$); we use $1-(1-\cos \theta)^{0.2}$ to enhance separation where the distributions overlap.

The distributions of the input features for signal and background particles are shown in Fig. 5, with the corresponding correlation matrix provided in Appendix C. An anti-correlation is observed between the Transformed $\Delta R$ and Transformed $\cos \theta$ variables for both signal and background categories, along with a moderate correlation between $\log(\chi^2_{\mathrm{IP\,wrt\,SV}})$ and $\log(\chi^2_{\textrm{DOCA}})$. Apart from these, the features exhibit only weak correlations, suggesting that they offer largely complementary information for the classification task. The relative importance of each feature is discussed in the following section.
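
The three power-law transformations used for the angular features can be written compactly, as in the sketch below. The function names are hypothetical, and the clamping of $1-\cos$ at zero is an assumption added only to guard against floating-point rounding; it is not described in the text.

```python
def transformed_delta_r(delta_r):
    """(Delta R)^0.2: stretches the small-angle region."""
    return delta_r ** 0.2

def transformed_dira(cos_alpha):
    """(1 - cos alpha)^0.2: emphasises small misalignments near cos alpha = 1."""
    return max(0.0, 1.0 - cos_alpha) ** 0.2

def transformed_cos_theta(cos_theta):
    """1 - (1 - cos theta)^0.2: equals 1 for perfectly aligned momenta."""
    return 1.0 - max(0.0, 1.0 - cos_theta) ** 0.2

# Two small angular separations that are nearly indistinguishable on a linear scale.
close = transformed_delta_r(0.001)
less_close = transformed_delta_r(0.01)
```

The exponent 0.2 expands the region near zero: $\Delta R$ values of 0.001 and 0.01 differ by 0.009 before the transformation, but map to roughly 0.25 and 0.40 afterwards, which gives the classifier more room to place splits among small-angle candidates.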

Training and performance

On average, each simulated decay sample contains $(3\text{--}4)\times 10^{4}$ non-isolated signal particles, accompanied by a substantially larger number of isolated background particles, typically 50 to 100 times more numerous than the signal. To train the IMI classifier, each sample is split into three statistically independent subsets: 70% for training, 15% for validation, and 15% for evaluation. During training and validation, the background class is randomly down-sampled to match the number of signal particles, enforcing a balanced class ratio that stabilises gradient updates. In contrast, the evaluation set retains the full class imbalance ($\gtrsim 50{:}1$), thus providing a realistic performance estimate under Run 3-like conditions at LHCb.
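The splitting and down-sampling procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the label convention (1 for signal, 0 for background) and the toy sample sizes are all hypothetical.

```python
import numpy as np

def split_and_balance(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return index arrays (train, val, eval).

    Background (label 0) is randomly down-sampled to the number of signal
    particles (label 1) in the training and validation subsets only; the
    evaluation subset keeps its natural class imbalance.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    idx = rng.permutation(len(labels))
    n_train = int(fractions[0] * len(labels))
    n_val = int(fractions[1] * len(labels))
    parts = [idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]]

    def balance(part):
        sig = part[labels[part] == 1]
        bkg = part[labels[part] == 0]
        keep = rng.choice(bkg, size=min(len(sig), len(bkg)), replace=False)
        return np.concatenate([sig, keep])

    return balance(parts[0]), balance(parts[1]), parts[2]

# Toy sample with a 50:1 background-to-signal ratio (illustrative numbers).
labels = np.array([1] * 1000 + [0] * 50000)
train, val, evaluation = split_and_balance(labels)
print(len(train), len(val), len(evaluation))
```

Keeping the evaluation subset unbalanced means that any rejection figure measured on it refers to the realistic particle mix rather than to the artificial 1:1 ratio seen during training.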

The classifier performance of IMI is quantified by the area under the receiver-operating-characteristic (ROC) curve, abbreviated AUC. Hyperparameters, including the number of trees, maximum tree depth, and learning rate, are tuned using the Bayesian optimisation framework Optuna [37]. The objective is to minimise the Kolmogorov–Smirnov (KS) statistic between the classifier outputs on the training and validation sets, thereby suppressing overfitting. The KS statistic obtained from the comparison between the training and validation samples at the optimal hyperparameter point is $k = 8.5\times 10^{-4}$. This very small value indicates that the output-score distributions for the training and validation samples are nearly indistinguishable, implying that the classifier generalises well and that no statistically meaningful overtraining is observed.
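For reference, the two-sample KS statistic used as the tuning objective is the largest distance between the empirical cumulative distributions of two score samples. The sketch below computes it with NumPy only; the function name is an assumption, and the call into Optuna is deliberately omitted because the study configuration is not given in the text.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    # Empirical CDFs of both samples evaluated at every observed value.
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Toy classifier outputs standing in for training and validation scores.
rng = np.random.default_rng(0)
train_scores = rng.beta(2.0, 5.0, size=20000)
val_scores = rng.beta(2.0, 5.0, size=5000)
print(ks_statistic(train_scores, val_scores))
```

In a hyperparameter scan, a function returning this value for the training and validation scores of a candidate model would play the role of the quantity to be minimised.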

The IMI output for non-isolated (signal) and isolated (background) particles is shown in Fig. 6. In this figure, the filled histograms represent the training sample, while the markers indicate the evaluation sample. The close agreement between the two confirms that the model generalises well and exhibits no signs of overtraining. Quantitatively, the classifier achieves an AUC of

$$\text{AUC} = 0.9964(3)$$

on the evaluation set, where the uncertainty reflects the variation across a thousand bootstrap replicas.

Fig. 6. Distributions of the IMI classifier output for non-isolated signal (red) and isolated background (green) particles in the training sample (filled histograms) and evaluation sample (markers). The strong agreement confirms the absence of overtraining and the model’s ability to generalise to unseen data
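A bootstrap estimate of the AUC uncertainty can be sketched as below. The rank-based AUC formula is the standard Mann–Whitney identity; the function names, the toy score distributions and the choice of the standard deviation as the spread measure are assumptions, since the text only states that a thousand replicas were used.

```python
import numpy as np

def auc(scores, labels):
    """AUC from the rank-sum identity (assumes no signal/background score ties)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_sig = np.sum(labels == 1)
    n_bkg = len(labels) - n_sig
    return (ranks[labels == 1].sum() - n_sig * (n_sig + 1) / 2) / (n_sig * n_bkg)

def bootstrap_auc(scores, labels, n_replicas=1000, seed=0):
    """Mean and standard deviation of the AUC over resampled replicas."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(n_replicas):
        pick = rng.integers(0, len(scores), size=len(scores))  # with replacement
        values.append(auc(scores[pick], labels[pick]))
    return float(np.mean(values)), float(np.std(values))

# Toy scores: signal shifted to higher values than background.
rng = np.random.default_rng(1)
labels = np.array([1] * 500 + [0] * 1500)
scores = np.where(labels == 1, rng.normal(2.0, 1.0, 2000), rng.normal(0.0, 1.0, 2000))
mean_auc, sigma_auc = bootstrap_auc(scores, labels)
print(mean_auc, sigma_auc)
```

In the compact notation used above, $0.9964(3)$ denotes an uncertainty of 3 on the last quoted digit.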

To elucidate the classifier’s internal logic, we compute SHAP (SHapley Additive exPlanations) values [38], which quantify the contribution of each feature to the model output on a per-particle basis. The summary plot in Fig. 7 orders the inputs by their mean impact and visualises their effects across the evaluation sample. Three variables clearly dominate the decision boundary: $\log(\chi^{2}_{\mathrm{IP\,wrt\,SV}})$, $\chi^{2}_{\mathrm{IP\,wrt\,PV}}$, and $\Delta R$. Particles with small values of $\log(\chi^{2}_{\mathrm{IP\,wrt\,SV}})$ or $\Delta R$ (green points) push the SHAP value toward positive numbers, yielding a signal-like prediction, whereas large values (red points) shift the output toward background-like. Conversely, a large $\chi^{2}_{\mathrm{IP\,wrt\,PV}}$ indicates significant displacement from the primary vertex and therefore increases the signal score, while small values suppress it, a behaviour complementary to that of the other two variables. Secondary inputs, including the Transformed DIRA, SV displacement, and $\log(\chi^{2}_{\mathrm{DOCA}})$, introduce fine-grained topological information that sharpens the separation between classes. Kinematic observables such as $\log p_{T}$ play a supportive, albeit less critical, role. The narrow SHAP ranges observed for Signed flight distance and the Transformed $\cos\theta$ demonstrate that the model does not over-rely on weakly informative features. Overall, the SHAP analysis confirms that the classifier’s decisions are governed by physically meaningful observables and remain fully aligned with the underlying isolation logic.

Fig. 7. SHAP summary plot for the IMI classifier [38]. The y-axis lists input features in order of decreasing importance. 
The x-axis indicates each feature’s SHAP value, that is, its contribution to the classifier’s output, while the color encodes the raw feature value (green = low, red = high). Positive SHAP values push the classifier toward signal-like predictions, while negative values indicate background-like behaviour
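To make the meaning of a per-feature contribution concrete, the sketch below computes exact Shapley values for a toy two-feature model by enumerating all feature coalitions. It is an illustration only: the toy model, the feature values and the use of a single reference point are assumptions, and practical SHAP implementations for tree ensembles use faster, dedicated algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their actual value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for s in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

# Hypothetical toy score: decreases with feature 0, increases with feature 1.
def toy_model(z):
    return -2.0 * z[0] + 1.0 * z[1]

phi = shapley_values(toy_model, x=[3.0, 5.0], baseline=[1.0, 1.0])
print(phi)  # contributions sum to toy_model(x) - toy_model(baseline)
```

A negative contribution pulls the output toward the background-like side and a positive one toward the signal-like side, which is the reading of the horizontal axis in the summary plot.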

The benchmarking of the IMI tool against classical isolation methods is provided in Sect. 3.4.1, while its performance as a function of event multiplicity is discussed in Sect. 3.4.2. The dependence of signal efficiency on key kinematic variables is examined in Sect. 3.4.3.

Comparison with classical isolation

The performance of the IMI tool is benchmarked against classical isolation techniques described in Sect. 2, namely, the track isolation, cone isolation and vertex isolation methods. For the comparisons presented here and throughout the paper, we use standard working requirements for each method (see Sect. 2 for variable definitions):

$$\begin{aligned} \text{Track isolation:} \quad & \chi^{2}_{\mathrm{IP\,wrt\,PV}} > x \quad \text{or} \quad \texttt{samePV} = \text{True}, \\ \text{Cone isolation:} \quad & \Delta R < y, \\ \text{Vertex isolation:} \quad & \chi^{2}_{\mathrm{IP\,wrt\,SV}} < z, \\ \text{IMI:} \quad & \text{IMI score} > a. \end{aligned}$$

These requirements for the classical methods are chosen because they are widely deployed online and provide a robust balance between signal efficiency, background rejection, and throughput cost. While channel-specific offline analyses may combine multiple requirements for additional gains, our goal here is a like-for-like comparison using a baseline representative of general-purpose isolation used at the trigger/Spruce level.
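The four working requirements can be written as simple per-particle predicates, each returning whether an additional particle is retained. This is a sketch: the `Particle` container, its field names and the numerical thresholds passed in the example are hypothetical placeholders, not the values used in the paper.

```python
from dataclasses import dataclass

@dataclass
class Particle:
    chi2_ip_pv: float  # chi^2 of the impact parameter with respect to the PV
    same_pv: bool      # samePV flag
    delta_r: float     # angular separation from the base particle
    chi2_ip_sv: float  # chi^2 of the impact parameter with respect to the SV
    imi_score: float   # IMI classifier output

def track_isolation(p, x):
    return p.chi2_ip_pv > x or p.same_pv

def cone_isolation(p, y):
    return p.delta_r < y

def vertex_isolation(p, z):
    return p.chi2_ip_sv < z

def imi_isolation(p, a):
    return p.imi_score > a

# Placeholder thresholds, chosen only to exercise the functions.
p = Particle(chi2_ip_pv=1.0, same_pv=True, delta_r=0.8, chi2_ip_sv=20.0, imi_score=0.6)
print(track_isolation(p, x=9.0), cone_isolation(p, y=0.5),
      vertex_isolation(p, z=9.0), imi_isolation(p, a=0.05))
```

Each classical requirement depends on one or two variables, whereas the IMI score condenses all input features into a single threshold.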

Figure 8 shows the ROC curves obtained from the inclusive simulated samples listed in Table 1. In these plots we report the signal candidate efficiency, defined as the probability that all particles forming the reconstructed signal candidate are retained after applying the isolation requirement. This differs from the signal track efficiency, which counts the retention of individual signal particles. Candidate-level efficiency is therefore lower, but is the more relevant metric for analyses that reconstruct full decay chains. By contrast, the background rejection is defined with respect to background tracks, as these particles do not form part of any reconstructed candidate. The results demonstrate that IMI provides the best overall performance, reaching about $99\%$ background rejection at $95\%$ signal candidate efficiency. Beyond this point, the curve drops steeply, indicating a sharp and well-defined operational threshold. The cone isolation method, which accepts extra particles within a maximum angular separation $\Delta R < y$ from the base particle, can approach similar maximum background rejection, but only in a much narrower efficiency range ($\varepsilon_{\text{sig}} \approx 0$–$50\%$). This reflects a key limitation of relying solely on geometric separation: particles from the dense underlying event can mimic the topology of genuine signal tracks, reducing discrimination power at higher efficiencies. The track isolation method, based on requiring either the samePV association with the base particle or $\chi^{2}_{\mathrm{IP\,wrt\,PV}} > x$, is constrained by the binary nature of the samePV selection. Once most background particles are assigned to the correct primary vertex, its rejection power saturates at about $70\%$, and performance degrades quickly for $\varepsilon_{\text{sig}} \gtrsim 95\%$. The vertex isolation method, based on requiring compatibility with the reconstructed secondary vertex (e.g. $\chi^{2}_{\mathrm{IP\,wrt\,SV}} < z$), achieves substantially stronger rejection at high signal efficiencies than the PV-based track isolation, but remains limited by the resolution and stability of the reconstructed secondary vertex and therefore turns over earlier than IMI. Overall, the results highlight that IMI leverages a richer set of features beyond pure geometry or simple PV/SV association, enabling it to maintain both high background rejection and high signal efficiency over a wide operating range. The signal efficiency and background rejection power for all four methods as a function of the threshold values are shown in Appendix D.
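The distinction between track-level and candidate-level efficiency can be made explicit with a short sketch. The function names and the toy arrays are assumptions for illustration; `kept` marks whether each signal particle passes the isolation requirement and `candidate_id` says which reconstructed candidate it belongs to.

```python
import numpy as np

def track_efficiency(kept):
    """Fraction of individual signal particles that are retained."""
    return float(np.mean(kept))

def candidate_efficiency(kept, candidate_id):
    """Fraction of candidates for which all associated signal particles pass."""
    kept = np.asarray(kept, dtype=bool)
    candidate_id = np.asarray(candidate_id)
    all_kept = [kept[candidate_id == c].all() for c in np.unique(candidate_id)]
    return float(np.mean(all_kept))

def background_rejection(kept_bkg):
    """Fraction of background tracks that are removed."""
    return 1.0 - float(np.mean(kept_bkg))

# Three candidates with two signal particles each; one particle is lost.
kept = [True, True, False, True, True, True]
candidate_id = [0, 0, 1, 1, 2, 2]
print(track_efficiency(kept))                    # 5 of 6 particles retained
print(candidate_efficiency(kept, candidate_id))  # 2 of 3 candidates fully retained
```

Losing a single particle costs one sixth of the track efficiency but one third of the candidate efficiency in this toy case, which is why the candidate-level figure is the lower and more demanding of the two.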

The four lower panels (clockwise from top-left) of Fig. 8 explore the performance of the track, cone, vertex and IMI isolation methods when applied to three exclusive decay channels with varying kinematics and numbers of non-isolated signal particles (see Table 1):

${B}^0 \rightarrow D^{*-}\mu^{+}\nu_{\mu}$, featuring one non-isolated signal particle with soft kinematics;
$\Lambda_b \rightarrow {\Lambda_c^+}^{*}\mu^{-}\bar{\nu}_{\mu}$, featuring two relatively hard non-isolated signal particles;
${B}^0_{s} \rightarrow D_s^{(*)-}\,\ell^{+}\nu_{\ell}$, with two to five non-isolated particles, including children of a long-lived $D_s^-$.

Across all three benchmark channels, IMI maintains a robust and consistent performance, demonstrating that it is largely agnostic to the number of non-isolated signal particles and their kinematics in these modes. In contrast, the track isolation method shows similarly modest performance for each decay, underscoring that its discriminating power is governed almost entirely by the $\chi^{2}_{\mathrm{IP\,wrt\,PV}}$ threshold and saturates once most background tracks are assigned to their correct PV; minor differences in signal efficiency arise from correlations between particle kinematics and impact-parameter significance. Cone isolation exhibits the strongest dependence on the decay topology: for ${B}^0 \rightarrow D^{*-}\mu^{+}\nu_{\mu}$, where the relevant signal activity is relatively collimated, the method performs comparatively well, whereas for topologies with multiple and/or softer non-isolated particles (notably the $D_s^{(*)}$ mode) the performance degrades markedly. This reflects the need for a larger cone to retain all signal-side particles, which unavoidably admits more background. The vertex isolation method, based on requiring compatibility with the reconstructed secondary vertex, provides substantially stronger rejection than PV-based track isolation and shows only mild channel dependence for the ${B}^0$ and $\Lambda_b$ benchmarks. A more pronounced degradation is observed for ${B}^0_{s} \rightarrow D_s^{(*)-}\ell^{+}\nu_{\ell}$, consistent with the presence of additional displaced decay structure from the long-lived $D_s^-$: maintaining high candidate efficiency in this case requires a looser SV-compatibility requirement, which reduces background rejection.

The IMI working point is intentionally set to a conservative response threshold of 0.05. At this setting, the number of particles selected as signal-like per event is reduced from the ${\mathcal{O}}(200)$ charged particles typically reconstructed to ${\mathcal{O}}(10)$, while retaining a signal efficiency of roughly $99\%$. Although exclusive trainings generally outperform the inclusive ones, as they are optimised for a specific signal topology and kinematics, the low threshold ensures that a broad set of signal-like particles is retained. This allows analysts to perform more specialised trainings at the analysis stage while still benefiting from strong background rejection.

Fig. 8. Signal candidate efficiency versus background track rejection for the Track (teal), Cone (orange), Vertex (light blue), and IMI (dark blue) isolation methods. The top panel shows the performance on the inclusive MC sample; the red marker indicates the chosen IMI working point. The four lower panels (clockwise from top-left) show Track, Cone, IMI, and Vertex isolation, each evaluated on the inclusive sample (solid) and on three exclusive benchmark channels with increasing numbers of non-isolated signal particles (see Table 1): ${B}^0 \rightarrow D^{*-}\mu^{+}\nu_{\mu}$ (1 non-isolated particle), $\Lambda_b \rightarrow {\Lambda_c^+}^{*}\mu^{-}\bar{\nu}_{\mu}$ (2 non-isolated particles), and ${B}^0_{s} \rightarrow D_s^{(*)-}\,\ell^{+}\nu_{\ell}$ (2 to 5 non-isolated particles), where $\ell \in \{\mu^\pm, \tau^\pm\}$

Performance as a function of event multiplicity

We evaluate the robustness of different isolation methods in increasingly busy events by studying their performance as a function of event multiplicity, defined as the number of reconstructed charged particles in the event. This variable serves as a proxy for overall event activity, which is particularly relevant in hadronic collisions where occupancy can vary substantially. Each method is benchmarked at a fixed signal efficiency of approximately $99\%$, and its background rejection power is assessed across the multiplicity spectrum.

Figure 9 compares four isolation strategies (track isolation, cone isolation, vertex isolation, and IMI) as a function of event multiplicity, using the inclusive simulation sample described in Table 1. Across the full range of event multiplicities, the IMI algorithm demonstrates a clear advantage, consistently rejecting around $95\%$ of background while retaining $99\%$ signal efficiency. The track isolation method shows steadily improving performance with increasing multiplicity, reaching up to $50\%$ background rejection in high-occupancy events. This improvement arises because the isolation requirement is particularly effective at rejecting tracks originating from primary vertices other than the one that produced the signal candidate. The vertex isolation method also improves with event multiplicity and provides systematically stronger rejection than PV-based track isolation, reaching about $60\%$ in the highest-multiplicity events. This reflects the additional constraint of compatibility with the reconstructed secondary vertex: in busy events a larger fraction of unrelated tracks is inconsistent with the candidate SV and is therefore rejected, while the SV resolution remains sufficiently stable to maintain the fixed $99\%$ signal efficiency working point. In contrast, the cone isolation method yields relatively flat performance, plateauing at around $20\%$ background rejection. Its effectiveness is significantly lower than that of the other methods, especially in high-multiplicity environments where the isolation cone is more likely to contain unrelated particles.
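A measurement of background rejection at fixed signal efficiency, binned in event multiplicity, can be sketched as follows. The sketch assumes a single global score threshold tuned on the whole signal sample (the text does not state whether the threshold is set globally or per bin), and the function name, toy score distributions and bin edges are hypothetical.

```python
import numpy as np

def rejection_vs_multiplicity(sig_scores, bkg_scores, bkg_mult, bin_edges,
                              target_eff=0.99):
    """Return the score threshold and the background rejection per bin."""
    # Threshold above which a fraction `target_eff` of signal scores lies.
    threshold = np.quantile(sig_scores, 1.0 - target_eff)
    bins = np.digitize(bkg_mult, bin_edges) - 1
    rejection = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        in_bin = bins == b
        if in_bin.any():
            rejection[b] = np.mean(bkg_scores[in_bin] <= threshold)
    return threshold, rejection

# Toy inputs: signal scores peak near 1, background scores near 0.
rng = np.random.default_rng(1)
sig_scores = rng.beta(5.0, 1.0, size=5000)
bkg_scores = rng.beta(1.0, 5.0, size=20000)
bkg_mult = rng.integers(50, 400, size=20000)  # multiplicity of each track's event
edges = np.array([50, 150, 250, 400])
threshold, rejection = rejection_vs_multiplicity(sig_scores, bkg_scores, bkg_mult, edges)
print(threshold, rejection)
```

For the classical methods the same procedure applies with the classifier score replaced by the relevant cut variable, with the inequality direction adapted to each requirement.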

The lower panels of Fig. 9 present the same comparison for three exclusive benchmark channels, introduced earlier, that differ in kinematics and in the number of non-isolated particles: $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{B} ^0} \!\rightarrow \!D^{*-}\mu ^{+}\nu _{\mu }$$\end{document}$ (one soft), $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Lambda _b\!\rightarrow \!{{\Lambda } ^+_{c}} ^*\mu ^{-}\bar{\nu }_{\mu }$$\end{document}$ (two relatively hard), and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{B} ^0_{s}} \!\rightarrow \!D_s^{(*)-}\,\ell ^{+}\nu _{\ell }$$\end{document}$ (two to five, including daughters of a long-lived $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$D_s^-)$$\end{document}$ . 
Across all three channels, IMI achieves the best performance, delivering 90– $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$95\%$$\end{document}$ background rejection while maintaining $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$99\%$$\end{document}$ signal efficiency. The track isolation method performs similarly for the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{B} ^0} $$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{B} ^0_{s}} $$\end{document}$ decays, but slightly worse for the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Lambda _b$$\end{document}$ decay. 
This difference arises because the relatively soft non-isolated particles in $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{B} ^0} $$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{B} ^0_{s}} $$\end{document}$ decays tend to have larger impact parameters with respect to the PV than the harder particles from $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Lambda _b$$\end{document}$ decays, settling on a minimal $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\chi ^{2}_{\mathrm {IP\,wrt\,PV}}$$\end{document}$ requirement that is more effective at rejecting background in the former cases. 
The vertex isolation method provides strong rejection for the $B^0$ and $\Lambda_b$ benchmarks and is comparatively stable across multiplicity, but degrades significantly for the $B^0_s$ mode. This behaviour is consistent with the presence of additional displaced decay structure from the long-lived $D_s^-$: to retain $99\%$ candidate efficiency, the SV-compatibility requirement must be loosened to accommodate signal tracks originating from the downstream $D_s^-$ decay, which in turn admits more background and reduces the achievable rejection power. In contrast, the cone isolation method gives comparable results for the $B^0$ and $\Lambda_b$ decays, but performs significantly worse for the $B^0_s$ decay. For $B^0_s$ decays, reconstructing signal particles from long-lived $D_s^-$ mesons requires a larger isolation cone, which inevitably captures more background and reduces rejection power.

Fig. 9. Background rejection at a fixed signal efficiency of approximately $99\%$ for the Track (teal), Cone (orange), Vertex (light blue), and IMI (dark blue) isolation methods as a function of the number of reconstructed charged particles in the event (event multiplicity). The top panel shows performance on the inclusive simulated sample (see Table 1). The bottom panels show the performance for three exclusive decay channels with varying kinematics and numbers of non-isolated signal particles: $B^0 \rightarrow D^{*-}\mu^{+}\nu_{\mu}$ (1 non-isolated particle, left), $\Lambda_b \rightarrow {\Lambda_c^+}^{*}\mu^{-}\bar{\nu}_{\mu}$ (2 non-isolated particles, middle), and $B^0_s \rightarrow D_s^{(*)-}\ell^{+}\nu_{\ell}$ (2 to 5 non-isolated particles, right), where $\ell \in \{\mu^\pm, \tau^\pm\}$. The IMI tool consistently outperforms the classical methods across all event multiplicities and decay channels.

Signal efficiency as a function of kinematic variables

We also assess whether the signal efficiency of each isolation method varies as a function of key kinematic variables, particularly the squared four-momentum transfer, $q^2 = (p_B - p_{\text{had}})^2$, which is equivalently the invariant mass squared of the lepton–neutrino system (e.g. $q^2 = m^2_{\ell\nu_\ell}$ for $B^0 \rightarrow D^{*-}\ell^{+}\nu_{\ell}$ decays). Ideally, a well-designed selection should yield a flat efficiency across the entire $q^2$ spectrum. This is especially important at high $q^2$, where theoretical predictions from lattice QCD are most precise, enabling precision measurements of CKM matrix elements.
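The equivalence of the two definitions of $q^2$ can be checked numerically. The following is a minimal sketch, assuming all four-momenta are known exactly (as for generator-level quantities in simulation); the function names and the toy four-momenta are illustrative and are not taken from the LHCb software.

```python
import math


def add4(a, b):
    """Component-wise sum of two four-vectors (E, px, py, pz)."""
    return tuple(x + y for x, y in zip(a, b))


def sub4(a, b):
    """Component-wise difference of two four-vectors (E, px, py, pz)."""
    return tuple(x - y for x, y in zip(a, b))


def mass_squared(p):
    """Minkowski norm E^2 - |p|^2 with metric (+, -, -, -)."""
    e, px, py, pz = p
    return e * e - (px * px + py * py + pz * pz)


def q2_from_hadron(p_b, p_had):
    """q^2 = (p_B - p_had)^2."""
    return mass_squared(sub4(p_b, p_had))


def q2_from_lepton_pair(p_lep, p_nu):
    """q^2 = m^2 of the lepton-neutrino system."""
    return mass_squared(add4(p_lep, p_nu))


# Toy four-momenta in GeV (illustrative values, not a physical decay).
P_B = (5.28, 0.0, 0.0, 0.0)      # b hadron at rest
P_HAD = (2.5, 0.0, 0.0, 1.5)     # hadronic system
P_NU = (1.0, 0.6, 0.0, -0.8)     # massless neutrino
P_LEP = sub4(sub4(P_B, P_HAD), P_NU)  # fixed by four-momentum conservation

if __name__ == "__main__":
    a = q2_from_hadron(P_B, P_HAD)
    b = q2_from_lepton_pair(P_LEP, P_NU)
    print(a, b, math.isclose(a, b, rel_tol=1e-9))
```

Because the lepton momentum is fixed by four-momentum conservation, both functions return the same value up to floating-point rounding.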

To study this, we evaluate the signal efficiency as a function of $q^2$ for each isolation method. For the IMI tool, we apply the nominal working point defined earlier, corresponding to an output threshold of IMI $> 0.05$ on each signal particle. For track isolation, we use the samePV flag or $\chi^2_{\mathrm{IP\,wrt\,PV}} > 16$. For cone isolation, we use a fixed cone size of $\Delta R = 0.5$, which is typically used to capture the b-jet structure. For vertex isolation, we use a common secondary-vertex compatibility requirement of $\chi^2_{\mathrm{IP\,wrt\,SV}} < 1$. Figure 10 shows the resulting signal efficiencies for two representative decay channels: $B^0 \rightarrow D^{*-}\mu^{+}\nu_{\mu}$ (left), with one non-isolated signal particle, and $\Lambda_b \rightarrow {\Lambda_c^+}^{*}\mu^{-}\bar{\nu}_{\mu}$ (right), with two non-isolated particles, where $\Lambda_c^*$ denotes the $\Lambda_c(2625)$ state. Cone, track, and vertex isolation show modest $q^2$-dependent variations – reflecting their intrinsic correlations with the signal kinematics – whereas the IMI tool retains a consistently high and nearly flat efficiency across the full $q^2$ range.

Fig. 10. Signal efficiency as a function of $q^2$ for different isolation methods: IMI with output $> 0.05$ (blue), track isolation with samePV and $\chi^2_{\mathrm{IP\,wrt\,PV}} > 16$ (teal), cone isolation with a fixed cone size of $\Delta R = 0.5$ (orange), and vertex isolation with $\chi^2_{\mathrm{IP\,wrt\,SV}} < 1$ (light blue). Shown are two decay channels: $B^0 \rightarrow D^{*-}\mu^{+}\nu_{\mu}$ (left), with one non-isolated signal particle, and $\Lambda_b \rightarrow {\Lambda_c^+}^{*}\mu^{-}\bar{\nu}_{\mu}$ (right), with two. Here $\Lambda_c^*$ denotes the $\Lambda_c(2625)$ state.

Fig. 11. Comparison of the two classical isolation strategies implemented in the LHCb selection framework. Although type “B” is designed for the trigger, it can also be run offline, ensuring consistent isolation variable computation across channels using different strategies.
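As an illustration of the $q^2$ efficiency study described above, the sketch below applies the IMI working point to each signal particle of a candidate and computes the efficiency in bins of $q^2$. It is a minimal example under stated assumptions: the function names, the input layout, and the simple binomial uncertainty are hypothetical choices for illustration and do not reproduce the analysis code.

```python
import math


def candidate_passes_imi(signal_scores, threshold=0.05):
    """A candidate passes if every non-isolated signal particle has IMI > threshold."""
    return all(score > threshold for score in signal_scores)


def binned_efficiency(q2_values, passed, edges):
    """Efficiency per q^2 bin.

    q2_values : q^2 of each candidate
    passed    : bool per candidate (True if it survives the isolation cut)
    edges     : ascending bin edges; bin i covers [edges[i], edges[i+1])
    Returns a list of (low, high, efficiency, uncertainty); empty bins give None.
    """
    n_bins = len(edges) - 1
    total = [0] * n_bins
    kept = [0] * n_bins
    for q2, ok in zip(q2_values, passed):
        for i in range(n_bins):
            if edges[i] <= q2 < edges[i + 1]:
                total[i] += 1
                if ok:
                    kept[i] += 1
                break  # values outside all bins are ignored
    result = []
    for i in range(n_bins):
        if total[i] == 0:
            result.append((edges[i], edges[i + 1], None, None))
            continue
        eff = kept[i] / total[i]
        # Simple binomial uncertainty (an assumption; it vanishes at eff = 0 or 1).
        err = math.sqrt(eff * (1.0 - eff) / total[i])
        result.append((edges[i], edges[i + 1], eff, err))
    return result


if __name__ == "__main__":
    # Toy candidates: (q^2 in GeV^2, IMI scores of the signal particles).
    toy = [(1.0, [0.90]), (1.5, [0.02]), (6.0, [0.70, 0.30]), (7.0, [0.06, 0.80])]
    q2 = [c[0] for c in toy]
    ok = [candidate_passes_imi(c[1]) for c in toy]
    for row in binned_efficiency(q2, ok, [0.0, 5.0, 10.0]):
        print(row)
```

A flat efficiency corresponds to the same value, within uncertainties, in every bin.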

Implementation and data-size reduction

Section 4.1 describes the implementation of both the classical isolation algorithms and the new IMI algorithm, and their integration into the LHCb software framework. The data-size reduction achieved by the IMI and its impact on Sprucing throughput are discussed in Sect. 4.2.

Integration into the LHCb selection framework

The computation of classical isolation variables, introduced in Sect. 2, is implemented in the LHCb software through two complementary workflows, illustrated in Fig. 11.

Classical isolation A: In this implementation, combinations are formed between the reconstructed base particles and all other charged particles in the event that could contribute to the isolation assessment. The selected additional particles are persisted for offline use, where cone- and vertex-based observables are computed by explicitly fitting a vertex between the base and each extra particle. These observables can then be stored [39] and used directly in offline analyses. The corresponding algorithms are implemented within the Rec reconstruction framework [40].

Fig. 12. Isolation workflow based on the Inclusive Multivariate Isolation (IMI) approach.

Fig. 13. (Left) Relative reduction in output file size (blue) and relative inference throughput (red) as a function of the IMI threshold. The throughput is evaluated in steps of 0.01 and, for display, the median is shown in bins of width 0.2; the shaded band indicates the standard deviation. Throughput is measured across all production LHCb sprucing lines relative to a baseline in which isolation is ignored and all extra particles are persisted. As expected, the throughput is only weakly dependent on the IMI threshold, since the candidate combinations (and associated vertexing) are constructed independently of this cut and the threshold primarily controls how much content is written out. (Right) File-size reduction versus signal-candidate efficiency for the inclusive simulation sample. The nominal working point is indicated by the vertical dashed line and the red marker at $\mathrm{IMI} = 0.05$.

Classical isolation B: The second implementation follows the same principle as the first, but computes the isolation variables directly at the trigger level. Instead of persisting the full information of the additional particles, only a minimal set of observables is stored, which significantly reduces the average event size. To optimise throughput in the case of vertex isolation, no dedicated vertex fit is performed; rather, the observables are derived from the impact-parameter significance of the extra particles with respect to the vertex formed by the base particles. This approach is well motivated, since the impact-parameter $\chi^2$ with respect to the decay vertex and the change in the vertex-fit $\chi^2$ when adding the extra track are found to be equivalent. The resulting quantities are written to the event record and can be used directly in offline analyses. The corresponding algorithm is implemented within the Rec reconstruction framework [40].

Fig. 14. Distributions of (top) $\Delta R$ and (bottom) $\log\bigl(\chi^2_{\mathrm{IP\,wrt\,SV}}\bigr)$ for extra particles in partial Run 3 data (left) and simulation (right) of $B^0 \rightarrow D^{*-}\ell^{+}\nu_{\ell}$ candidates. The samples are split by the IMI rank of the extra particle (highest, second-highest, and lowest IMI score).

Inclusive multivariate isolation: The IMI strategy follows a conservative data-storage model. At the trigger level, all reconstructed particles in the event are written to tape, and the actual isolation decision is deferred to the Sprucing stage, where the fully reconstructed event is available. At this point, only those additional particles identified as signal-like are retained on disk, leading to a significant reduction in event size. As illustrated in Fig. 12, each signal candidate is combined with every other particle in the event, and a vertex fit is attempted. To avoid an excessive number of combinations, loose fiducial cuts are applied to the additional particles, such as $\log\bigl(\chi^2_{\mathrm{IP\,wrt\,SV}}\bigr) < 5$, a minimum signed flight distance greater than $-3\,\mathrm{mm}$, and a minimum SV displacement greater than $-5\,\mathrm{mm}$. A multivariate classifier then evaluates each base-extra particle pair and assigns an IMI score. The resulting relation table, linking base particles to additional particles with their IMI scores, provides a flexible input for offline analyses. It can be used to suppress backgrounds, to rank extra particles when reconstructing complex decay chains, or to define background-enriched control regions. Again, all the related algorithms form part of the Rec project [40].
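The relation table described above can be pictured as a list of (base, extra, score) entries. The sketch below is a minimal illustration, assuming a simple in-memory representation; the `Relation` type, the integer identifiers, and the function names are hypothetical and do not correspond to the actual Rec data structures.

```python
from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    """One row of a toy relation table (hypothetical layout)."""
    base_id: int      # identifier of the base (signal candidate) particle
    extra_id: int     # identifier of the additional particle
    imi_score: float  # classifier output for this base-extra pair


def select_and_rank(relations, threshold=0.05):
    """Keep pairs with IMI > threshold, grouped by base and sorted by descending score."""
    table = defaultdict(list)
    for rel in relations:
        if rel.imi_score > threshold:
            table[rel.base_id].append(rel)
    for rels in table.values():
        rels.sort(key=lambda r: r.imi_score, reverse=True)
    return dict(table)


def control_region(relations, threshold=0.05):
    """Pairs at or below the threshold, e.g. to build a background-enriched sample."""
    return [rel for rel in relations if rel.imi_score <= threshold]


if __name__ == "__main__":
    toy = [
        Relation(0, 10, 0.92),
        Relation(0, 11, 0.01),
        Relation(0, 12, 0.40),
        Relation(1, 10, 0.03),
    ]
    print(select_and_rank(toy))
    print(control_region(toy))
```

Keeping the scores in the table, rather than only a pass/fail decision, is what allows the same output to serve background suppression, ranking, and control-region definitions.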

Data-size reduction and throughput

Before integrating the IMI algorithm into the LHCb selection framework, it was essential to demonstrate that it could deliver a substantial reduction in the volume of data written to disk without compromising processing throughput. To evaluate this trade-off, we processed a large sample of minimum-bias simulation events generated under Run 3 conditions, with all semileptonic Sprucing selection lines enabled. These lines reconstruct on average two b-hadron candidates and write minimal information to disk.

For each combination of a base particle and an extra particle in the event, the maximum IMI score is evaluated, and the combination is retained only if this score exceeds a configurable threshold. By scanning the threshold from 0 to 1 in steps of 0.01, we map out the relationship between (i) the relative reduction in output file size, (ii) changes in processing throughput, and (iii) the signal-candidate efficiency. The results are presented in Fig. 13. The left panel shows the relative file-size reduction (blue) and throughput variation (red). The file-size measurement is normalised to the baseline where no cut is applied (i.e., all additional particles are retained), and the throughput is normalised to the case in which all particles are saved without running the IMI at all. The right panel shows the direct trade-off between file-size reduction and signal-candidate efficiency, where the signal efficiency is the one obtained using the inclusive simulation sample (Table 1). In these figures, the reported file-size reductions are evaluated after applying a common set of loose pre-cuts, described in Sect. 4.1. These pre-cuts are applied uniformly in all configurations, including the “no-isolation” baseline, and serve only to remove pathological vertex combinations (e.g. extreme or ill-defined SV-association values). They are intentionally chosen to be essentially lossless for signal candidates $(\gtrsim 99\%)$ while rejecting only a small fraction of background tracks $(\lesssim 10\text{–}15\%)$, so their standalone impact on the overall file-size reduction is minimal compared with the effect of the isolation requirements studied here.
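The threshold scan described above can be sketched on toy inputs. This is an illustration under stated assumptions only: the fraction of extra particles dropped is used as a stand-in for the file-size reduction (the real measurement is made on output files, which also contain content unaffected by the cut), and a candidate is counted as efficient when all of its true signal extras pass. The function name and input layout are hypothetical.

```python
def scan_thresholds(candidates, thresholds):
    """Scan an IMI threshold over toy candidates.

    candidates : list of candidates; each is a list of (imi_score, is_signal)
                 tuples, one per extra particle
    thresholds : iterable of threshold values
    Returns a list of (threshold, dropped_fraction, candidate_efficiency).
    """
    n_extras = sum(len(cand) for cand in candidates)
    n_cands = len(candidates)
    results = []
    for t in thresholds:
        kept = sum(1 for cand in candidates for score, _ in cand if score > t)
        efficient = sum(
            1
            for cand in candidates
            if all(score > t for score, is_signal in cand if is_signal)
        )
        dropped = 1.0 - kept / n_extras if n_extras else 0.0
        eff = efficient / n_cands if n_cands else 0.0
        results.append((t, dropped, eff))
    return results


if __name__ == "__main__":
    toy = [
        [(0.90, True), (0.01, False), (0.02, False)],
        [(0.60, True), (0.30, True), (0.04, False)],
        [(0.03, True), (0.00, False)],
    ]
    # Same scan granularity as in the text: 0 to 1 in steps of 0.01.
    grid = [i / 100 for i in range(101)]
    for t, dropped, eff in scan_thresholds(toy, grid)[:8]:
        print(f"threshold={t:.2f} dropped={dropped:.2f} efficiency={eff:.2f}")
```

Plotting the second column against the third reproduces the kind of trade-off curve shown in the right panel of Fig. 13, here for toy numbers only.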

At the nominal working point, defined by $\mathrm{IMI} > 0.05$, the output file size is reduced by 45%, while preserving more than $99\%$ of signal candidates. It is worth noting that the maximum achievable file-size reduction asymptotes at roughly 50%, as the remaining portion consists of indispensable reconstructed content such as primary vertices, base particles, neutral objects, and trigger information (see Fig. 1). Crucially, the throughput remains essentially constant, increasing by less than $0.1\%$ across the full threshold range. This is to be expected as all combinations of base particles and additional charged particles will be made regardless of the chosen cut value. The slight increase in throughput at higher thresholds simply reflects fewer events being written, which is a comparatively light computational task.

Although IMI is intrinsically lightweight at the inference stage, its integration introduces additional overheads – dominated by the vertex fits needed to evaluate the input features – resulting in an overall throughput reduction of about $20\%$ in Sprucing. This sets the scale of the reduction shown in Fig. 13 (left). Such an overhead is acceptable at Sprucing, where throughput is far less critical than in Hlt2: for example, the Full stream in Hlt2 operates at input rates of 0.5–$1.5\,\mathrm{MHz}$, while Sprucing handles only about $0.04\,\mathrm{MHz}$, i.e. more than an order of magnitude lower. In this regime, the modest throughput penalty is clearly outweighed by the $\sim 45\%$ reduction in file size, which substantially improves downstream data handling and storage efficiency. We additionally note that the throughput spread decreases as the IMI threshold is tightened.
The vertexing needed to evaluate the IMI inputs is performed independently of the threshold and therefore sets an approximately constant baseline cost, while the threshold mainly controls how many extra particles are ultimately persisted. At low thresholds we observe a higher spread driven by a small number of outliers, predominantly due to fluctuations in the load of the machine used for the throughput measurements (i.e. periods when the host is more or less busy). An additional contribution arises from event-dependent variability in packing/serialization and I/O time when many more particles are written out.

Validation in data

While classical isolation algorithms from Run 2 have been successfully adapted and re-designed for Run 3, the IMI algorithm represents a completely new development. Given that much of the current semileptonic physics programme at LHCb, particularly analyses involving excited charm (and charmless) states and so-called “double-charm” decays, relies heavily on this algorithm, a robust validation in data is essential.

Ranking behaviour in real data

The IMI algorithm assigns a score to each extra particle near a selected signal candidate, reflecting how likely it is to originate from the same decay chain as the base particle. To validate that this ranking is physically meaningful, we compare the highest-, second-highest-, and lowest-ranked extra particles in partial Run 3 data $(0.01\,\mathrm{pb}^{-1})$ used to reconstruct $B^0 \rightarrow D^{*-}\ell^{+}\nu_{\ell}$ candidates and in simulation of the same decay. Figure 14 shows the distributions of two representative input features: the cone angle $\Delta R$ and the IP $\chi^2$ with respect to the SV, $\log\bigl(\chi^2_{\mathrm{IP\,wrt\,SV}}\bigr)$.

In both data and simulation, the ordering follows the expected pattern: the most signal-like particles (highest IMI score) concentrate at smaller $\Delta R$ and low $\chi^2_{\mathrm{IP\,wrt\,SV}}$, while the least signal-like particles are shifted to larger $\Delta R$ and larger $\chi^2_{\mathrm{IP\,wrt\,SV}}$; the second-most signal-like particles lie in between. The separation between ranks is well reproduced, with good overall data–simulation agreement. Differences are visible, with data generally showing broader shapes, leading to slightly more overlap between ranks.
These residual differences are driven by components present in data that are not explicitly included in the signal-only simulation, such as combinatorial and mis-identified background and feed-down from higher orbitally excited charm states $(D^{**})$, together with residual mismodelling of underlying-event activity from soft QCD. Overall, the observed behaviour indicates that IMI performs the intended physical ranking. The remaining input features show similar agreement and are presented in Appendix E.

Fig. 15. (Top) $\Delta M_{D^{*-}}$ distribution in partial Run 3 data, using the highest-ranked $\pi^-$ to reconstruct $D^{*-}$ mesons from $\bar{D}^{0}$ candidates. The signal (red) and background (dashed blue) fit components are overlaid. (Bottom) The $\Delta M_{\Lambda_c^*} = M(\Lambda_c^{+}\pi^{+}\pi^{-}) - M(\Lambda_c) + M_{\mathrm{PDG}}(\Lambda_c)$ distribution in partial Run 3 data, using the two highest-ranked oppositely charged particles to reconstruct $\Lambda_c^{*+}$ baryons from $\Lambda_c^{+}$ candidates. Vertical lines indicate the known $\Lambda_c^{*+}$ masses from PDG.

Reconstructing resonances using IMI-selected particles

A more stringent test of the algorithm’s performance is whether the extra particles selected by IMI can be used to reconstruct well-known resonances, without relying on channel-specific tuning.

We first consider reconstruction of the $D^{*-} \rightarrow \bar{D}^{0} \pi^{-}$ decay. Starting from a clean $\bar{D}^{0} \rightarrow K^{+}\pi^{-}$ candidate, we combine the single highest-ranked extra particle with negative charge (assumed to be a $\pi^-$) with the base particles to form a $D^{*-}$. The resulting $\Delta M_{D^{*-}} \equiv |M_{D^{*-}} - M_{\bar{D}^{0}}|$ spectrum, shown in Fig. 15 (top), exhibits a clear, narrow peak on top of a small combinatorial background. Notably, only loose particle identification requirements are applied to the base particles, further highlighting the discriminating power of IMI.
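The combination step described above can be illustrated with a minimal sketch. It is not the LHCb implementation: the `Particle` class, its `imi_score` field, and the helper names are hypothetical, momenta are assumed to be in MeV, and the selected extra particle is simply assigned the charged-pion mass.

```python
import math
from dataclasses import dataclass

PION_MASS = 139.57  # MeV, charged-pion mass used for the pi- hypothesis


@dataclass
class Particle:
    px: float  # momentum components in MeV
    py: float
    pz: float
    mass: float  # mass hypothesis in MeV
    charge: int = 0
    imi_score: float = 0.0  # hypothetical field holding the IMI output

    @property
    def energy(self) -> float:
        return math.sqrt(self.px**2 + self.py**2 + self.pz**2 + self.mass**2)


def invariant_mass(particles):
    """Invariant mass of the summed four-momenta of a list of particles."""
    e = sum(p.energy for p in particles)
    px = sum(p.px for p in particles)
    py = sum(p.py for p in particles)
    pz = sum(p.pz for p in particles)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def delta_m(base_particles, extras):
    """|M(base + pi-) - M(base)| using the highest-ranked negative extra particle.

    Returns None when the event has no negatively charged extra particle.
    """
    negatives = [p for p in extras if p.charge < 0]
    if not negatives:
        return None
    best = max(negatives, key=lambda p: p.imi_score)
    # Re-assign the pion mass hypothesis to the selected extra particle.
    pion = Particle(best.px, best.py, best.pz, PION_MASS, best.charge, best.imi_score)
    return abs(invariant_mass(base_particles + [pion]) - invariant_mass(base_particles))
```

Here `base_particles` would hold the $K^{+}$ and $\pi^{-}$ of the $\bar{D}^{0}$ candidate and `extras` the remaining charged particles in the event with their IMI scores. No channel-specific requirement enters beyond the charge and the ranking.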

A similar validation is performed in $\Lambda_{b}^{0} \rightarrow \Lambda_{c}^{*+} \mu^{-} \bar{\nu}_{\mu}$ decays, where excited charm baryons $\Lambda_{c}^{*+} \rightarrow \Lambda_{c}^{+} \pi^{+} \pi^{-}$ are reconstructed by combining the two highest-ranked oppositely charged extra particles with a $\Lambda_{c}^{+}$ baryon. The resulting $\Delta M_{\Lambda_c^*} = M(\Lambda_{c}^{+}\pi^{+}\pi^{-}) - M(\Lambda_c) - M_{\mathrm{PDG}}(\Lambda_c)$ distribution, shown in Fig. 15 (bottom), reveals peaks corresponding to the $\Lambda_{c}(2595)^{+}$ and $\Lambda_{c}(2625)^{+}$ resonances, as well as a structure around the $\Lambda_{c}(2880)^{+}$. Again, these results are obtained with minimal selection, demonstrating that IMI can reliably recover non-isolated signal decay products in data.

Isolation efficiency in data and simulation

While the IMI algorithm performs well qualitatively, it is also important to evaluate its signal efficiency quantitatively, and compare data to simulation. Figure 16 shows the efficiency of the IMI selection (left) and the fraction of charged particles accepted per event (right), as a function of the threshold, computed on simulated $B^{0} \rightarrow D^{*-} \ell^{+} \nu_{\ell}$ events and compared to background-subtracted partial Run 3 data in which the same decay is reconstructed. At low IMI thresholds, the data–simulation agreement is good for the isolation efficiency, but not for the fraction of accepted charged particles: data consistently retains more particles at all thresholds. This is expected because the data sample contains sizeable backgrounds that are not present in the signal simulation, including combinatorial contributions (only the fake-$D^{*}$ component has been subtracted), partially reconstructed decays, mis-identified hadronic decays, and feed-down from excited charm states $(D^{**})$, together with residual mismodelling of underlying-event activity from soft and multiple-parton interactions. At tighter IMI thresholds the isolation efficiency also begins to diverge, pointing to mismodelling of some input-feature distributions in simulation; this could be reduced in future via inclusive simulation-to-data corrections. In practice, the tool is operated with loose working points chosen to retain high signal efficiency, so these discrepancies have minimal impact on downstream physics analyses.

Fig. 16. Signal efficiency (left) and fraction of charged particles accepted per event (right) as a function of IMI threshold, computed on simulated $B^{0} \rightarrow D^{*-} e^{+} \nu_{e}$ events and compared to partial Run 3 data
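The two quantities scanned in Fig. 16 can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the paper: efficiency is taken to be the fraction of true signal particles with a score at or above the threshold, the accepted fraction is averaged per event, and both function names and the list-of-scores input layout are hypothetical. Background subtraction in data is not modelled.

```python
def efficiency_curve(signal_scores, thresholds):
    """Fraction of signal particles with score >= threshold, per threshold.

    Assumes signal_scores is a non-empty list of scores for true signal particles.
    """
    n = len(signal_scores)
    return [sum(s >= t for s in signal_scores) / n for t in thresholds]


def accepted_fraction_curve(events, thresholds):
    """Mean per-event fraction of charged particles with score >= threshold.

    events is a list of per-event score lists; empty events are skipped.
    """
    curve = []
    for t in thresholds:
        fractions = [
            sum(s >= t for s in scores) / len(scores)
            for scores in events
            if scores
        ]
        curve.append(sum(fractions) / len(fractions))
    return curve
```

Evaluating both functions on the same threshold grid for simulation and for data gives the pairs of curves that the comparison above refers to.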

Summary and outlook

For Run 3, the LHCb experiment faces the demanding task of reducing data rates by up to a factor of eight, imposing stringent constraints not only on which events are selected, but also on the size of each recorded event. While signal decays typically involve just 5–7 charged particles, a typical Run 3 event contains ${\mathcal{O}}(200)$ reconstructed tracks, with charged particle information alone accounting for over 50% of the total event size. To address this imbalance, a suite of inclusive isolation tools was developed, including classical track-, cone-, and vertex-based methods, alongside a new Inclusive Multivariate Isolation (IMI) algorithm.

As its name implies, IMI is inherently both inclusive in scope and multivariate in structure. It is inclusive in that it has been trained on a wide range of simulated events, encompassing diverse decay topologies and kinematic configurations, including scenarios where signal particles emerge from both short- and long-lived intermediate states within the b-hadron decay chain. Its multivariate nature arises from combining the prominent features of all classical isolation techniques (cone-, vertex-, and track-based) into a single classifier. Built on the fast and lightweight XGBoost framework, IMI assigns a score to each extra particle based on its compatibility with a signal origin, allowing only the most relevant particles to be retained for downstream analysis. This selective retention enables the efficient reconstruction of complex decay chains with varying final-state multiplicities and facilitates the definition of background-enriched control regions, both essential for controlling systematic uncertainties in precision measurements. As of 2025, IMI plays a central role in the LHCb semileptonic physics programme involving missing energy and is well suited for broader application to other decay channels.

The IMI algorithm delivers exceptional performance across a broad range of decay modes and event multiplicities, consistently outperforming classical isolation techniques. It achieves an area under the curve (AUC) of 0.997, demonstrating strong separation power between signal and background particles. At its nominal working point, IMI rejects over 90% of background while retaining approximately $99\%$ of signal particles, representing a $2$–$5\times$ improvement in background rejection relative to traditional methods in the inclusive sample. Crucially, this performance is preserved even in high-multiplicity environments and across diverse decay topologies, without introducing biases in sensitive kinematic observables such as the momentum transfer squared $(q^2)$ in semileptonic decays. By retaining only ${\mathcal{O}}(10)$ signal-like particles out of the ${\mathcal{O}}(200)$ reconstructed tracks per event, IMI yields a significantly cleaner event representation, leading to a data-size reduction of approximately 45%.
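For readers who want to reproduce figures of merit of this kind on their own samples, the sketch below shows one way to compute an AUC and the background rejection at a fixed signal efficiency from two lists of scores. It is a generic illustration, not the evaluation code behind the numbers quoted above: the function names are hypothetical, and the pairwise AUC loop is quadratic in sample size, so it is only suitable for small checks.

```python
import math


def auc(signal_scores, background_scores):
    """Probability that a random signal particle outscores a random background one.

    Ties count as one half, which matches the area under the ROC curve.
    """
    wins = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(signal_scores) * len(background_scores))


def rejection_at_efficiency(signal_scores, background_scores, target_eff):
    """Tightest threshold keeping at least target_eff of signal, and the
    fraction of background particles scoring strictly below it."""
    ranked = sorted(signal_scores, reverse=True)
    n_keep = max(1, math.ceil(target_eff * len(ranked)))
    threshold = ranked[n_keep - 1]
    rejected = sum(b < threshold for b in background_scores)
    return threshold, rejected / len(background_scores)
```

Fixing the signal efficiency first and reading off the rejection mirrors the way the working point is described above, where a high signal retention is the primary requirement.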

Both the classical and IMI isolation algorithms are fully integrated into the LHCb selection framework, each optimised for a different stage of the data-processing pipeline. The two also operate independently: selection lines using IMI do not rely on candidates preselected with the classical isolation tool. Classical isolation is deployed directly at the trigger level through two complementary approaches: one links signal particles to a minimal set of nearby tracks based on cone isolation, enabling isolation observables to be computed offline; the other computes and stores these observables at the trigger stage itself, allowing for immediate event size reduction with minimal impact on processing throughput. In contrast, IMI adopts a more conservative strategy: the full reconstructed event is written to tape after the trigger stage, and IMI is applied later at the offline Sprucing stage, while ensuring that the additional memory and throughput cost remains negligible. This design offers long-term flexibility, as the Sprucing stage can be re-run on triggered events, enabling future updates to IMI without the need to modify or reprocess data at the trigger level.

Validation using Run 3 data confirms that the IMI algorithm performs reliably under real data-taking conditions. It produces physically consistent and interpretable rankings of extra particles in the event, enabling the reconstruction of well-known resonances such as the $D^{*+}$ and $\Lambda_c^*$ without requiring dedicated tuning. For the loose working point used in production, the agreement between data and simulation in terms of signal efficiency is excellent across the phase space relevant to most analyses. These results give good confidence in the use of IMI as a core isolation tool in the LHCb selection framework.

In the near term, planned improvements to IMI include extending isolation to neutral particles (Fig. 1), adding VELO-based features for dense environments, and adopting a multiclass classifier to identify excited heavy-flavour states currently treated with cut-based selections (Appendix B). Looking further ahead, an attractive possibility is a broader architectural role for IMI as a fast pruning layer in multi-stage reconstruction. In this setup, IMI would ease the load on more complex, compute-intensive stages (e.g., GNN-based approaches [41]), helping to keep data volumes manageable in Run 4 and enabling the ${\mathcal{O}}(20)$ reductions anticipated for Run 5 [42, 43]. In conclusion, the demonstrated Run 3 performance and lightweight design make IMI a compelling and forward-compatible building block for scalable reconstruction in future runs.

Bibliography


1. LHCb Collaboration, R. Aaij et al., The LHCb upgrade I. JINST 19(05), P05065 (2024). 10.1088/1748-0221/19/05/P05065. arXiv:2305.10515
2. LHCb Collaboration, M. Saur, LHCb HLT2: real-time alignment, calibration, and software quality-assurance. PoS ICHEP2022, 685 (2023). 10.22323/1.414.0685
3. N. Schulte et al., Development of the topological trigger for LHCb Run 3. (2023). arXiv:2306.09873
4. LHCb Collaboration, Computing model of the upgrade LHCb experiment. CERN-LHCC-2018-014, LHCB-TDR-018 (2018). http://cdsweb.cern.ch/search?p=CERN-LHCC-2018-014&f=reportnumber&action_search=Search&c=LHCb
5. A. Abdelmotteleb et al., The LHCb sprucing and analysis productions. Comput. Softw. Big Sci. 9(1), 15 (2025). 10.1007/s41781-025-00144-5. arXiv:2506.20309. PMC12321665. PMID 40771570
6. LHCb Collaboration, P. Li, Real-time analysis in Run 3 with the LHCb experiment. PoS EPS-HEP2021, 829 (2022). 10.22323/1.398.0829
7. LHCb Collaboration, R. Aaij et al., Test of lepton flavor universality using $B^{0} \rightarrow D^{*-}\tau^{+}\nu_{\tau}$ decays with hadronic $\tau$ channels. Phys. Rev. D 108, 012018 (2023). 10.1103/PhysRevD.108.012018. arXiv:2305.01463. [Erratum: Phys. Rev. D 109, 119902 (2024)]
8. LHCb Collaboration, R. Aaij et al., Measurement of the ratios of branching fractions $\mathcal{R}(D^{*})$ and $\mathcal{R}(D^{0})$. Phys. Rev. Lett. 131, 111802 (2023). 10.1103/PhysRevLett.131.111802. arXiv:2302.02886. PMID 37774262