Performance Modeling of Data Storage Systems using Generative Models
Abdalaziz Rashid Al-Maeeni, Aziz Temirkhanov, Artem Ryzhikov, Mikhail Hushchyn

TL;DR
This paper presents machine learning-based generative models for high-precision performance prediction of storage systems built from HDD and SSD pools, with applications in reliability checking, and provides datasets for benchmarking.
Contribution
It introduces probabilistic models for storage components that accurately predict IOPS and latency, and provides new datasets for benchmarking generative and regression models.
Findings
Prediction errors of 4-10% for IOPS and 3-16% for latency, depending on the component and model.
Predictions reach a Pearson correlation of up to 0.99 with Little's law.
New datasets for benchmarking regression and generative models.
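The Little's law comparison in the findings can be illustrated with a small sketch. Little's law states that the mean number of requests in a system equals throughput times mean time in the system, so IOPS can be estimated as outstanding requests divided by latency. The function names, the use of queue depth as the number of outstanding requests, and the sample values below are assumptions made for illustration; they are not taken from the paper.

```python
import math


def littles_law_iops(outstanding_requests, latency_seconds):
    """Estimate IOPS from Little's law: L = lambda * W, so lambda = L / W."""
    if latency_seconds <= 0:
        raise ValueError("latency must be positive")
    return outstanding_requests / latency_seconds


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: (queue depth, predicted latency in s, predicted IOPS).
predictions = [
    (8, 0.002, 4100.0),
    (16, 0.002, 7900.0),
    (16, 0.004, 3950.0),
]

# IOPS implied by Little's law from the predicted latency.
implied_iops = [littles_law_iops(qd, lat) for qd, lat, _ in predictions]
# IOPS predicted directly by the model.
predicted_iops = [iops for _, _, iops in predictions]

correlation = pearson(predicted_iops, implied_iops)
print(f"Pearson correlation with Little's law: {correlation:.4f}")
```

A correlation close to 1 would indicate that the predicted IOPS and latency are mutually consistent in the sense of Little's law, even though the model is not told about that relationship.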
Abstract
High-precision modeling of systems is one of the main areas of industrial data analysis. Models of systems, their digital twins, are used to predict their behavior under various conditions. We have developed several models of a storage system using machine learning-based generative models. The system consists of several components: hard disk drive (HDD) and solid-state drive (SSD) storage pools with different RAID schemes and cache. Each storage component is represented by a probabilistic model that describes the probability distribution of the component's performance in terms of IOPS and latency, depending on its configuration and external data load parameters. The experiments show errors of 4-10% for IOPS and 3-16% for latency predictions, depending on the components and models of the system. The predictions show up to 0.99 Pearson correlation with Little's…
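As a rough illustration of how a percentage error could be computed for a model that outputs a distribution rather than a single value, the sketch below compares the mean of samples drawn from a model with the mean of measured values. This is only one plausible reading: the summary does not state the exact metric used in the paper, and the function name and sample values are hypothetical.

```python
from statistics import mean


def mean_percentage_error(predicted_samples, observed_samples):
    """Percentage error between the means of predicted and observed samples.

    Assumption: the error is taken on the distribution mean; the paper may
    use a different statistic or metric.
    """
    observed_mean = mean(observed_samples)
    if observed_mean == 0:
        raise ValueError("observed mean must be non-zero")
    predicted_mean = mean(predicted_samples)
    return abs(predicted_mean - observed_mean) / abs(observed_mean) * 100.0


# Hypothetical IOPS samples for one configuration and load setting.
generated_iops = [1040, 1060, 1050]   # drawn from a generative model
measured_iops = [990, 1000, 1010]     # observed on the real component

error = mean_percentage_error(generated_iops, measured_iops)
print(f"IOPS mean percentage error: {error:.1f}%")
```

The same function could be applied to latency samples, or to another statistic such as the standard deviation, by replacing the call to `mean`.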
Taxonomy
Topics: Advanced Data Storage Technologies · Simulation Techniques and Applications · Distributed and Parallel Computing Systems
