TFBEST: Dual-Aspect Transformer with Learnable Positional Encoding for Failure Prediction
Rohan Mohapatra, Saptarshi Sengupta

TL;DR
This paper introduces TFBEST, a transformer-based model with learnable positional encoding, for improved failure prediction of HDDs, providing better confidence intervals and handling long log sequences efficiently.
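To make "learnable positional encoding" concrete, here is a minimal sketch of the general idea, not the paper's implementation. Instead of fixed sinusoids, the model keeps one trainable vector per position and adds it to the input at that position. The class name `LearnablePositionalEncoding`, the parameters `max_len` and `d_model`, and the 0.02 initialization scale are all assumptions made for illustration. In a real model the table would be updated by the optimizer along with the other weights.

```python
import random


class LearnablePositionalEncoding:
    """Hypothetical sketch: one trainable vector per sequence position."""

    def __init__(self, max_len, d_model, seed=0):
        rng = random.Random(seed)
        # Table of shape (max_len, d_model) with a small random init (assumed).
        # During training these values would be learned like any other weight.
        self.table = [
            [rng.gauss(0.0, 0.02) for _ in range(d_model)]
            for _ in range(max_len)
        ]

    def __call__(self, x):
        # x is a list of seq_len vectors, each of length d_model.
        if len(x) > len(self.table):
            raise ValueError("sequence longer than max_len")
        # Add the position-t vector to the input vector at position t.
        return [
            [xi + pi for xi, pi in zip(vec, self.table[t])]
            for t, vec in enumerate(x)
        ]


# Usage: encode a 2-step sequence of 3-dimensional feature vectors.
pe = LearnablePositionalEncoding(max_len=4, d_model=3)
encoded = pe([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

Because the table has a fixed number of rows, this sketch only supports sequences up to `max_len`, the usual trade-off of learned absolute positions.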
Contribution
The paper proposes a novel transformer architecture, TFBEST, that enhances RUL prediction accuracy and confidence estimation for HDD failures, outperforming existing CNN/RNN models.
Findings
Significantly outperforms state-of-the-art RUL prediction methods.
Effectively processes very long sequences of S.M.A.R.T. logs.
Provides a confidence margin statistic for failure time estimation.
Abstract
Hard Disk Drive (HDD) failures in datacenters are costly: with consequences ranging from catastrophic data loss to loss of goodwill, stakeholders want to avoid them like the plague. An important tool in proactively monitoring against HDD failure is timely estimation of the Remaining Useful Life (RUL). To this end, the Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.) employed within HDDs provides critical logs for long-term maintenance of the security and dependability of these essential data storage devices. Data-driven predictive models in the past have relied heavily on these S.M.A.R.T. logs and on CNN/RNN-based architectures. However, they have struggled both to provide a confidence interval around the predicted RUL values and to process very long sequences of logs. In addition, some of these approaches, such as those based on LSTMs, are inherently slow to train and have tedious…
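As a rough illustration of what an RUL target looks like, the sketch below labels each log record of a failed drive with the number of days left until failure. It assumes one S.M.A.R.T. record per day and that the last record falls on the failure day. The function name `rul_labels` is hypothetical, and the paper's actual labeling scheme is not described in this summary.

```python
def rul_labels(num_records):
    """Return the RUL in days for each daily record of a failed drive.

    Assumes one record per day, with the final record on the failure day,
    so the final record gets RUL 0 and the first gets num_records - 1.
    """
    return [num_records - 1 - i for i in range(num_records)]


# A drive with five daily records counts down to failure.
print(rul_labels(5))  # [4, 3, 2, 1, 0]
```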
Taxonomy
Topics: Advanced Data Storage Technologies · Software System Performance and Reliability · Cloud Computing and Resource Management
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Adam · Byte Pair Encoding · Softmax · Dropout · Label Smoothing · Absolute Position Encodings
