An Optimized Data Structure for High Throughput 3D Proteomics Data: mzRTree
Sara Nasso (1), Francesco Silvestri (1), Francesco Tisiot (1), Barbara Di Camillo (1), Andrea Pietracaprina (1), Gianna Maria Toffolo (1) ((1) Department of Information Engineering, University of Padova)

TL;DR
The paper introduces mzRTree, a scalable R-tree based data structure that significantly improves the efficiency of storing and querying high-throughput LC-MS proteomics data, reducing computational costs.
Contribution
It presents mzRTree, a novel, space-efficient index structure optimized for fast range queries on large LC-MS datasets, outperforming existing methods.
Findings
mzRTree outperforms existing data structures in range query speed.
mzRTree is more space-efficient than current solutions.
It reduces computational costs for large proteomics datasets.
Abstract
As an emerging field, MS-based proteomics still requires software tools for efficiently storing and accessing experimental data. In this work, we focus on the management of LC-MS data, which are typically made available in standard XML-based portable formats. The structures that are currently employed to manage these data can be highly inefficient, especially when dealing with high-throughput profile data. LC-MS datasets are usually accessed through 2D range queries. Optimizing this type of operation could dramatically reduce the complexity of data analysis. We propose a novel data structure for LC-MS datasets, called mzRTree, which embodies a scalable index based on the R-tree data structure. mzRTree can be efficiently created from the XML-based data formats and it is suitable for handling very large datasets. We experimentally show that, on all range queries, mzRTree outperforms other…
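To make the notion of a 2D range query concrete, the following is a minimal sketch in Python. It is not the mzRTree implementation. It assumes each peak is a (retention time, m/z, intensity) tuple and that a query is a rectangle in the retention time and m/z plane. The names `Leaf`, `build_leaves`, and `range_query` are hypothetical. The sketch keeps only the core idea that R-tree style indexes rely on: peaks are grouped into blocks with bounding boxes, and any block whose box does not intersect the query is skipped without inspecting its peaks.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed peak layout (not taken from the paper): (retention_time, mz, intensity)
Peak = Tuple[float, float, float]


@dataclass
class Leaf:
    """A block of peaks plus its bounding box in the (rt, m/z) plane."""
    rt_min: float
    rt_max: float
    mz_min: float
    mz_max: float
    peaks: List[Peak]


def build_leaves(peaks: List[Peak], leaf_size: int = 4) -> List[Leaf]:
    """Group peaks into fixed-size blocks after sorting by (rt, m/z).

    A real R-tree also builds a hierarchy above the leaves; this
    single-level version only illustrates bounding-box pruning.
    """
    ordered = sorted(peaks)
    leaves = []
    for start in range(0, len(ordered), leaf_size):
        chunk = ordered[start:start + leaf_size]
        leaves.append(
            Leaf(
                rt_min=min(p[0] for p in chunk),
                rt_max=max(p[0] for p in chunk),
                mz_min=min(p[1] for p in chunk),
                mz_max=max(p[1] for p in chunk),
                peaks=chunk,
            )
        )
    return leaves


def range_query(leaves: List[Leaf], rt_lo: float, rt_hi: float,
                mz_lo: float, mz_hi: float) -> List[Peak]:
    """Return all peaks inside the closed rectangle [rt_lo, rt_hi] x [mz_lo, mz_hi]."""
    hits: List[Peak] = []
    for leaf in leaves:
        # Skip blocks whose bounding box cannot overlap the query rectangle.
        if (leaf.rt_max < rt_lo or leaf.rt_min > rt_hi
                or leaf.mz_max < mz_lo or leaf.mz_min > mz_hi):
            continue
        # Otherwise check each peak in the block individually.
        hits.extend(
            p for p in leaf.peaks
            if rt_lo <= p[0] <= rt_hi and mz_lo <= p[1] <= mz_hi
        )
    return hits


# Synthetic data for illustration only: 10 scans with 5 peaks each.
demo_peaks = [(float(rt), 400.0 + 10.0 * j, 1.0)
              for rt in range(10) for j in range(5)]
demo_leaves = build_leaves(demo_peaks, leaf_size=5)
demo_hits = range_query(demo_leaves, 2.0, 4.0, 410.0, 430.0)
```

With blocks aligned to scans as in this toy data, a query that covers a narrow retention time window touches only the few blocks whose boxes overlap it, which is the effect a hierarchical index extends to very large datasets.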
