Are Markov Models Effective for Storage Reliability Modelling?
Prasenjit Karmakar, K. Gopinath

TL;DR
This paper demonstrates that, with careful approximations and a detailed model, continuous-time Markov chains (CTMCs) can model storage system reliability effectively, overcoming two traditional limitations: the memoryless property and the restriction to exponentially distributed failure and repair times.
Contribution
The authors introduce a method to accurately model storage reliability using CTMCs by incorporating non-exponential distributions and state-space reduction, challenging the notion that simulation is the only viable approach.
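For context, the kind of baseline CTMC reliability model being discussed can be sketched in a few lines. The example below is not the authors' model: it is the textbook three-state chain for a mirrored disk pair, and the failure rate `lam` and repair rate `mu` are hypothetical values chosen only for illustration. It computes the mean time to data loss (MTTDL) by solving the linear system Q t = -1 over the transient states and checks the result against the known closed form (3λ + μ) / (2λ²).

```python
def mean_time_to_absorption(Q):
    """Solve Q t = -1 for the expected time to absorption from each
    transient state, where Q is the CTMC generator restricted to the
    transient states. Uses Gauss-Jordan elimination with partial pivoting."""
    n = len(Q)
    A = [row[:] + [-1.0] for row in Q]  # augmented matrix [Q | -1]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Hypothetical rates (per hour), for illustration only.
lam = 1e-5   # per-disk failure rate
mu = 0.1     # repair (rebuild) rate

# Transient states: 0 = both disks healthy, 1 = one disk failed.
# The absorbing "data loss" state is left out of the matrix.
Q = [
    [-2 * lam, 2 * lam],
    [mu, -(lam + mu)],
]

mttdl = mean_time_to_absorption(Q)[0]          # start with both disks healthy
closed_form = (3 * lam + mu) / (2 * lam ** 2)  # textbook result for this chain
print(f"MTTDL (numeric):     {mttdl:.6e} hours")
print(f"MTTDL (closed form): {closed_form:.6e} hours")
```

Every transition here is exponential, so this sketch shows exactly the simplification the paper sets out to relax.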
Findings
CTMC models can approximate non-exponential failure distributions.
The proposed approach reduces computational cost compared to simulation.
Results closely match those obtained from simulation.
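One common way to bring a non-exponential lifetime into a CTMC is the method of stages: replace a single exponential transition with a chain of k exponential phases (an Erlang-k distribution). The summary above does not say which approximation the authors use, so the sketch below is an assumption made for illustration, not their method. It matches the mean and, approximately, the coefficient of variation of a Weibull lifetime; the Weibull `shape` and `scale` values are hypothetical.

```python
import math


def weibull_mean_cv2(shape, scale):
    """Mean and squared coefficient of variation of a Weibull distribution."""
    m1 = scale * math.gamma(1 + 1 / shape)
    m2 = scale ** 2 * math.gamma(1 + 2 / shape)
    return m1, m2 / m1 ** 2 - 1


def erlang_fit(mean, cv2):
    """Pick k phases so that 1/k is close to cv2, then match the mean.
    Only meaningful when cv2 <= 1 (Weibull shape >= 1)."""
    k = max(1, round(1 / cv2))
    return k, k / mean  # (number of phases, per-phase rate)


def erlang_survival(t, k, rate):
    """P(T > t) for Erlang-k: exp(-x) * sum_{n<k} x^n / n!, with x = rate*t."""
    x = rate * t
    term, total = 1.0, 0.0
    for n in range(k):
        total += term
        term *= x / (n + 1)
    return math.exp(-x) * total


def weibull_survival(t, shape, scale):
    return math.exp(-((t / scale) ** shape))


# Hypothetical Weibull parameters (hours), chosen for illustration only.
shape, scale = 2.0, 100_000.0
mean, cv2 = weibull_mean_cv2(shape, scale)
k, rate = erlang_fit(mean, cv2)
print(f"Erlang phases k = {k}, per-phase rate = {rate:.3e} per hour")
for t in (25_000, 50_000, 100_000, 150_000):
    w = weibull_survival(t, shape, scale)
    e = erlang_survival(t, k, rate)
    print(f"t = {t:>7}: Weibull {w:.4f}, Erlang-{k} {e:.4f}")
```

Each added phase multiplies the number of states per component, which is why an approach like this is usually paired with some form of state-space reduction. An Erlang chain cannot represent a squared coefficient of variation above 1 (Weibull shape below 1); that case needs a different phase-type family, such as a hyperexponential.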
Abstract
Continuous Time Markov Chains (CTMC) have been used extensively to model reliability of storage systems. While the exponentially distributed sojourn time of Markov models is widely known to be unrealistic (and it is necessary to consider Weibull-type models for components such as disks), recent work has also highlighted some additional infirmities with the CTMC model, such as the inability to handle repair times. Due to the memoryless property of these models, any failure or repair of one component resets the "clock" to zero, with any partial repair or aging in some other subsystem forgotten. It has therefore been argued that simulation is the only accurate technique available for modelling the reliability of a storage system with multiple components. We show how both the above problematic aspects can be handled when we consider a careful set of approximations in a detailed model of the…
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed systems and fault tolerance · Cloud Data Security Solutions
