DeepFT: Fault-Tolerant Edge Computing using a Self-Supervised Deep Surrogate Model

Shreshth Tuli; Giuliano Casale; Ludmila Cherkasova; Nicholas R. Jennings

arXiv:2212.01302 · cs.DC · December 5, 2022 · 1 citation

Open Access

TL;DR

DeepFT is a scalable, self-supervised deep surrogate model for fault prediction and diagnosis in resource-constrained edge computing. It improves fault detection accuracy over baseline methods, reduces service deadline violations by up to 37%, and improves response time by up to 9%.

Contribution

It introduces a novel self-supervised learning approach with a deep surrogate model for proactive fault tolerance in edge computing environments.

Findings

01. Outperforms baseline methods in fault detection accuracy.

02. Reduces service deadline violations by up to 37%.

03. Improves response time by up to 9%.

Abstract

The emergence of latency-critical AI applications has been supported by the evolution of the edge computing paradigm. However, edge solutions are typically resource-constrained, posing reliability challenges due to heightened contention for compute and communication capacities and faulty application behavior in the presence of overload conditions. Although a large amount of generated log data can be mined for fault prediction, labeling this data for training is a manual process and thus a limiting factor for automation. Due to this, many companies resort to unsupervised fault-tolerance models. Yet, failure models of this kind can incur a loss of accuracy when they need to adapt to non-stationary workloads and diverse host characteristics. To cope with this, we propose a novel modeling approach, called DeepFT, to proactively avoid system overloads and their adverse effects by optimizing…

Taxonomy

Topics: IoT and Edge/Fog Computing · Age of Information Optimization · Cloud Computing and Resource Management