Towards Learned Predictability of Storage Systems
Chenyuan Wu

TL;DR
This paper surveys machine learning-based approaches for predicting performance issues and failures in storage systems, aiming to improve reliability and reduce latency in datacenter environments.
Contribution
It provides a comprehensive review of recent predictive mechanisms and field studies, analyzing their strengths and limitations in the context of storage system predictability.
Findings
Identifies key machine learning techniques used in storage prediction
Highlights challenges and limitations of current approaches
Discusses future directions for proactive storage system prediction
Abstract
With the rapid development of cloud computing and big data technologies, storage systems have become a fundamental building block of datacenters, incorporating hardware innovations such as flash solid-state drives and non-volatile memories, as well as software infrastructures such as RAID and distributed file systems. Despite the growing popularity of and interest in storage, designing and implementing reliable storage systems remains challenging because of their performance instability and prevalent hardware failures. Proactive prediction greatly strengthens the reliability of storage systems. There are two dimensions of prediction: performance and failure. Ideally, by detecting slow I/O requests in advance and predicting device failures before they occur, we can build storage systems with especially low tail latency and high availability. While its importance is well…
Taxonomy
Topics: Advanced Data Storage Technologies · Cloud Computing and Resource Management · Caching and Content Delivery
