The Life and Death of SSDs and HDDs: Similarities, Differences, and Prediction Models
Riccardo Pinciroli, Lishan Yang, Jacob Alter, Evgenia Smirni

TL;DR
This study compares HDD and SSD failures in data centers using six years of field data, and builds machine learning models that predict failures from workload and device data and help expose their root causes.
Contribution
It provides a comprehensive comparative analysis of HDD and SSD failures using large-scale field data and introduces effective machine learning models for failure prediction and root cause analysis.
Findings
HDD failures are better distinguished by the time a drive spends on head positioning than by its age.
SSD failures show high infant mortality rates.
Machine learning models achieve high recall and low false positives in failure prediction.
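To make the two metrics in the last finding concrete, here is a minimal sketch of how recall and false positive rate are computed from binary failure predictions. The function name and the labels are made up for illustration; they are not the paper's data or its evaluation code.

```python
def recall_and_fpr(y_true, y_pred):
    """Return (recall, false positive rate) for binary labels, 1 = failed drive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failures caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # failures missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy drives flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy drives passed
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


# Hypothetical labels for ten drives (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
recall, fpr = recall_and_fpr(y_true, y_pred)
print(f"recall={recall:.3f}, false positive rate={fpr:.3f}")
```

Failed drives are rare compared with healthy ones, so overall accuracy can look good even when most failures are missed. Recall and false positive rate are more informative in that setting.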
Abstract
Data center downtime typically centers around IT equipment failure. Storage devices are the most frequently failing components in data centers. We present a comparative study of hard disk drives (HDDs) and solid state drives (SSDs) that constitute the typical storage in data centers. Using six years of field data on 100,000 HDDs of different models from the same manufacturer, drawn from the BackBlaze dataset, and six years of field data on 30,000 SSDs of three models from a Google data center, we characterize the workload conditions that lead to failures and illustrate that their root causes differ from common expectations but remain difficult to discern. For HDDs, we observe that young and old drives do not differ much in their failures. Instead, failures may be distinguished by discriminating drives based on the time spent on head positioning. For SSDs, we observe high…
Taxonomy
Topics: Advanced Data Storage Technologies · Cloud Computing and Resource Management · Data Stream Mining Techniques
