HybridTune: Spatio-temporal Data and Model Driven Performance Diagnosis for Big Data Systems
Rui Ren, Jiechao Cheng, Xiwen He, Lei Wang, Chunjie Luo, Jianfeng Zhan

TL;DR
HybridTune is a performance diagnosis tool that combines spatio-temporal correlation analysis with data-driven and model-driven algorithms to diagnose performance bottlenecks in Big Data systems such as Spark and Hadoop, with the aim of making diagnosis more efficient and accurate.
Contribution
The paper introduces HybridTune, a lightweight, extensible performance diagnosis tool for Big Data systems that combines spatio-temporal correlation analysis with data-driven and model-driven algorithms.
Findings
Achieves about 80% accuracy in abnormal/outlier detection.
Effectively identifies bottlenecks like workload imbalance and data skew.
Supports performance analysis on Spark and Hadoop applications.
Abstract
With tremendously growing interest in Big Data systems, analyzing and improving their performance has become increasingly important. Although much research effort has gone into improving the performance of Big Data systems, efficiently analyzing and diagnosing performance bottlenecks in these massively distributed systems remains a major challenge. In this paper, we propose a spatio-temporal correlation analysis approach based on the stage characteristics and distribution characteristics of Big Data applications, which can associate multi-level performance data at a fine granularity. On the basis of the correlated data, we define a priori rules, select features, and vectorize the corresponding datasets for different performance bottlenecks, such as workload imbalance, data skew, abnormal nodes, and outlier metrics. We then utilize data and model driven algorithms for bottlenecks…
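To make the idea of stage-based correlation plus an a priori rule more concrete, here is a minimal Python sketch. It is not HybridTune's implementation: the record layout, the function names (`group_by_stage`, `skewed_stages`), the sample data, and the ratio threshold of 3.0 are all assumptions made for illustration. It groups hypothetical task records by stage, then flags a stage as data-skewed when its largest task input is far above the stage median.

```python
from collections import defaultdict
from statistics import median

# Hypothetical task-level records: (stage_id, node, input_bytes, duration_s).
# The layout and values are invented for this example.
TASKS = [
    ("stage-0", "node-a", 100, 10.0),
    ("stage-0", "node-b", 110, 11.0),
    ("stage-0", "node-c", 900, 95.0),
    ("stage-1", "node-a", 200, 20.0),
    ("stage-1", "node-b", 210, 21.0),
    ("stage-1", "node-c", 190, 19.0),
]


def group_by_stage(tasks):
    """Collect task records per stage (the 'stage characteristic' grouping)."""
    stages = defaultdict(list)
    for stage_id, node, input_bytes, duration_s in tasks:
        stages[stage_id].append((node, input_bytes, duration_s))
    return dict(stages)


def skewed_stages(tasks, ratio_threshold=3.0):
    """Return {stage_id: node} for stages whose largest task input exceeds
    ratio_threshold times the stage's median task input.

    The threshold is an assumed a priori rule, not a value from the paper.
    """
    flagged = {}
    for stage_id, records in group_by_stage(tasks).items():
        sizes = [input_bytes for _, input_bytes, _ in records]
        med = median(sizes)
        if med > 0 and max(sizes) / med > ratio_threshold:
            # Report the node that processed the largest input in this stage.
            flagged[stage_id] = max(records, key=lambda r: r[1])[0]
    return flagged


if __name__ == "__main__":
    print(skewed_stages(TASKS))
```

A median-based ratio is used here only because it is simple and robust to a single large task; the abstract does not specify which rules or features the tool applies for data skew.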
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Software System Performance and Reliability · Traffic Prediction and Management Techniques
