Efficient Training Approaches for Performance Anomaly Detection Models in Edge Computing Environments
Duneesha Fernando, Maria A. Rodriguez, Patricia Arroba, Leila Ismail, Rajkumar Buyya

TL;DR
This paper introduces two clustering-based training methods for performance anomaly detection in edge computing, balancing accuracy and efficiency amidst resource constraints and large device counts.
Contribution
It proposes novel clustering-based training approaches, ICPTL and CM, that improve training efficiency while maintaining high detection accuracy in edge environments.
Findings
ICPTL achieves similar accuracy to device-specific models with 40% less training time.
CM reduces training time by 23% and the number of models by 66%, outperforming a single general model.
Both methods effectively balance detection accuracy and training efficiency in resource-constrained edge settings.
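The general idea behind clustering-based training can be illustrated with a minimal sketch: group devices with similar resource profiles, then train one model per group instead of one per device. This is not the authors' ICPTL or CM procedure, which this summary does not describe. The device names, the two-feature profiles, and the `kmeans` helper below are all hypothetical, chosen only to show how clustering reduces the number of models to train.

```python
from statistics import mean


def kmeans(points, k, iters=20):
    """Plain k-means over tuples; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


# Hypothetical per-device profiles: (mean CPU utilisation, mean memory utilisation).
profiles = {
    "dev-a": (0.20, 0.30),
    "dev-b": (0.80, 0.75),
    "dev-c": (0.22, 0.28),
    "dev-d": (0.78, 0.80),
    "dev-e": (0.25, 0.33),
    "dev-f": (0.82, 0.70),
}

names = list(profiles)
labels = kmeans([profiles[n] for n in names], k=2)

# Group device names by cluster label; one model would be trained per group.
clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)

print(f"{len(names)} devices -> {len(clusters)} models to train")
for label, members in sorted(clusters.items()):
    print(label, members)
```

In this toy setup six devices collapse into two groups, so two models are trained instead of six. How the groups are formed and how accuracy is preserved within each group is exactly what the paper's methods address.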
Abstract
Microservice architectures are increasingly used to modularize IoT applications and deploy them in distributed and heterogeneous edge computing environments. Over time, these microservice-based IoT applications are susceptible to performance anomalies caused by resource hogging (e.g., CPU or memory), resource contention, etc., which can negatively impact their Quality of Service and violate their Service Level Agreements. Existing research on performance anomaly detection for edge computing environments focuses on model training approaches that either achieve high accuracy at the expense of a time-consuming and resource-intensive training process or prioritize training efficiency at the cost of lower accuracy. To address this gap, while considering the resource constraints and the large number of devices in modern edge platforms, we propose two clustering-based model training approaches…
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · IoT and Edge/Fog Computing
