# Predicting Breakdowns in Cloud Services (with SPIKE)

**Authors:** Jianfeng Chen, Joymallya Chakraborty, Philip Clark, Kevin Haverlock, Snehit Cherian, Tim Menzies

arXiv: 1905.06390 · 2019-06-18

## TL;DR

SPIKE is a data mining tool that predicts cloud service breakdowns 30 minutes in advance, enabling proactive responses to minimize downtime and reputation loss.
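One way to read "30 minutes in advance" is as a labeling choice: each observation is paired with what happens a fixed number of steps later. The sketch below illustrates that framing only. The function name `label_ahead`, the 10-minute sampling interval, and the threshold of 80 are assumptions made for illustration, not details taken from the paper.

```python
def label_ahead(values, threshold, steps_ahead):
    """Pair each reading at time t with whether the reading
    steps_ahead samples later exceeds the threshold."""
    pairs = []
    for t in range(len(values) - steps_ahead):
        future_spike = values[t + steps_ahead] > threshold
        pairs.append((values[t], future_spike))
    return pairs


# Hypothetical load readings, assumed to be sampled every 10 minutes,
# so 3 steps ahead corresponds to 30 minutes.
load = [10, 12, 11, 15, 40, 90, 95, 30]
pairs = label_ahead(load, threshold=80, steps_ahead=3)
```

The last `steps_ahead` readings have no known future value, so they produce no training pair.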

## Contribution

The paper introduces SPIKE, a novel predictive system combining regression trees, synthetic over-sampling, hyperparameter tuning, and topology sampling for early failure detection in cloud services.
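To make the topology sampling idea more concrete, here is a minimal sketch of building one training vector from a node's own metrics plus summary statistics over its neighbors. The function `topology_vector`, the choice of mean and max as summaries, the zero padding for isolated nodes, and the example metrics are all assumptions for illustration; the summary above does not say which details or summaries SPIKE actually uses.

```python
from statistics import mean


def topology_vector(node, metrics, neighbors):
    """Return the node's own metrics followed by the per-metric
    mean and max over its neighbors."""
    own = list(metrics[node])
    nbr_rows = [metrics[n] for n in neighbors.get(node, [])]
    if not nbr_rows:
        # Isolated node: pad with zeros so every vector has the same length.
        summary = [0.0] * (2 * len(own))
    else:
        columns = list(zip(*nbr_rows))
        summary = [mean(c) for c in columns] + [max(c) for c in columns]
    return own + summary


# Hypothetical per-node metrics: [cpu_utilization, requests_per_second]
metrics = {"a": [0.9, 120.0], "b": [0.5, 80.0], "c": [0.75, 100.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": []}

vec_a = topology_vector("a", metrics, neighbors)
vec_c = topology_vector("c", metrics, neighbors)
```

Summarizing neighbors rather than concatenating their full metrics keeps the vector length fixed regardless of how many neighbors a node has.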

## Key findings

- SPIKE predicts service spikes 30 minutes ahead with recall and precision of 75% and above.
- In the reported experiments, SPIKE performed relatively better than neural nets, random forests, and logistic regression.
- Early predictions enable organizations to mitigate service failures effectively.
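For reference, recall and precision as cited in the findings above can be computed as follows. This is a minimal sketch that assumes predictions have already been reduced to binary spike / no-spike labels; the example labels are made up and do not come from the paper's data.

```python
def precision_recall(actual, predicted):
    """Compute (precision, recall) for binary labels, where 1 means a spike."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels: 4 real spikes, 4 predicted spikes, 3 of them correct.
actual = [1, 0, 0, 1, 1, 0, 0, 1]
predicted = [1, 0, 1, 1, 0, 0, 0, 1]
precision, recall = precision_recall(actual, predicted)
```

Recall is the fraction of real spikes that were predicted; precision is the fraction of predicted spikes that were real. Both matter here because spikes are rare, so a predictor that never raises an alarm would still be right most of the time.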

## Abstract

Maintaining web-services is a mission-critical task where any down-time means loss of revenue and reputation (of being a reliable service provider). In the current competitive web services market, such a loss of reputation causes extensive loss of future revenue. To address this issue, we developed SPIKE, a data mining tool which can predict upcoming service breakdowns, half an hour into the future. Such predictions let an organization alert and assemble the tiger team to address the problem (e.g. by reconfiguring cloud hardware in order to reduce the likelihood of that breakdown). SPIKE utilizes (a) regression tree learning (with CART); (b) synthetic minority over-sampling (to handle how rare spikes are in our data); (c) hyperparameter optimization (to learn best settings for our local data) and (d) a technique we called "topology sampling" where training vectors are built from extensive details of an individual node plus summary details on all their neighbors. In the experiments reported here, SPIKE predicted service spikes 30 minutes into the future with recalls and precision of 75% and above. Also, SPIKE performed relatively better than other widely-used learning methods (neural nets, random forests, logistic regression).
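The synthetic minority over-sampling step in (b) can be illustrated with a simplified, SMOTE-style sketch: new minority examples are created by interpolating between an existing minority example and one of its nearest minority neighbors. The function `smote_like`, its parameters, and the example rows are assumptions for illustration. This is not the paper's implementation, and a maintained library implementation would be preferable in practice.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating between a random
    minority row and one of its k nearest minority neighbors.
    Requires at least two minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        # Sort the remaining rows by squared Euclidean distance to base.
        others.sort(key=lambda m: sum((x - y) ** 2 for x, y in zip(base, m)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([x + gap * (y - x) for x, y in zip(base, neighbor)])
    return synthetic


# Hypothetical feature rows for the rare "spike" class.
spikes = [[0.9, 120.0], [0.8, 110.0], [0.95, 130.0]]
new_rows = smote_like(spikes, n_new=5)
```

Because each synthetic row lies on a line segment between two real minority rows, it stays within the range of values already observed for the rare class.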

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.06390/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1905.06390/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1905.06390/full.md

---
Source: https://tomesphere.com/paper/1905.06390