SkimROOT: Accelerating LHC Data Filtering with Near-Storage Processing
Narangerelt Batsoyol, Jonathan Guiang, Diego Davila, Aashay Arora, Philip Chang, Frank Würthwein, Steven Swanson

TL;DR
SkimROOT is a near-storage processing system that uses Data Processing Units to filter LHC data directly on storage servers, significantly reducing data transfer bottlenecks and accelerating high-energy physics data analysis.
Contribution
We propose SkimROOT, a novel near-data filtering system utilizing DPUs to improve LHC data analysis efficiency by minimizing data movement.
Findings
Achieved 44.3× performance improvement.
Reduced data transfer delays in LHC data filtering.
Demonstrated effectiveness of near-storage processing in HEP.
Abstract
Data analysis in high-energy physics (HEP) begins with data reduction, where vast datasets are filtered to extract relevant events. At the Large Hadron Collider (LHC), this process is bottlenecked by slow data transfers between storage and compute nodes. To address this, we introduce SkimROOT, a near-data filtering system leveraging Data Processing Units (DPUs) to accelerate LHC data analysis. By performing filtering directly on storage servers and returning only the relevant data, SkimROOT minimizes data movement and reduces processing delays. Our prototype demonstrates significant efficiency gains, achieving a 44.3× performance improvement, paving the way for faster physics discoveries.
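The idea of filtering at the storage server and returning only the selected events can be illustrated with a toy model. The sketch below is not SkimROOT's implementation. The `Event` class, the `muon_pt` attribute, the 2 kB event size, and the selection cut are all hypothetical, chosen only to show how the number of bytes crossing the network differs between the two approaches. It does not attempt to reproduce the 44.3× figure, which comes from the authors' prototype.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Event:
    event_id: int
    muon_pt: float       # hypothetical physics quantity (GeV), for illustration only
    payload_bytes: int   # assumed on-disk size of the event


Predicate = Callable[[Event], bool]


def filter_on_client(events: List[Event], keep: Predicate) -> Tuple[List[Event], int]:
    """Baseline: every event crosses the network, then the compute node filters."""
    bytes_moved = sum(e.payload_bytes for e in events)
    selected = [e for e in events if keep(e)]
    return selected, bytes_moved


def filter_near_storage(events: List[Event], keep: Predicate) -> Tuple[List[Event], int]:
    """Near-storage: the filter runs next to the data; only passing events are sent."""
    selected = [e for e in events if keep(e)]
    bytes_moved = sum(e.payload_bytes for e in selected)
    return selected, bytes_moved


# Synthetic dataset: 10,000 events of 2,000 bytes each (assumed values).
events = [Event(i, muon_pt=float(i % 100), payload_bytes=2_000) for i in range(10_000)]


def passes_cut(e: Event) -> bool:
    # Hypothetical selection: keep events with muon_pt above 90 GeV.
    return e.muon_pt > 90.0


base_sel, base_bytes = filter_on_client(events, passes_cut)
near_sel, near_bytes = filter_near_storage(events, passes_cut)

print(f"selected events:       {len(near_sel)}")
print(f"bytes moved (client):  {base_bytes}")
print(f"bytes moved (storage): {near_bytes}")
```

Both functions select the same events; only the point at which the data crosses the network changes. In this toy setup the transferred volume scales with the fraction of events that pass the cut, which is why more selective filters would be expected to benefit most from near-storage processing.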
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
