Mining Scientific Workflows for Anomalous Data Transfers
Huy Tu, George Papadimitriou, Mariam Kiran, Cong Wang, Anirban Mandal, Ewa Deelman, and Tim Menzies

TL;DR
This paper introduces X-FLASH, a machine learning tool that improves detection of network anomalies in scientific workflows, significantly enhancing accuracy through hyperparameter tuning and data mining techniques.
Contribution
The paper presents a novel anomaly detection method combining XGBoost with a sequential optimizer, achieving up to 40% improvement over existing approaches.
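To make the idea of a sequential optimizer concrete, the following is a minimal sketch of budget-limited sequential search over hyperparameter settings. It is not the X-FLASH implementation: the nearest-neighbour surrogate, the `toy_score` objective (standing in for training and scoring a classifier such as XGBoost), the grid of `learning_rate` and `max_depth` values, and the budget of 30 evaluations are all assumptions chosen for illustration.

```python
import random

def sequential_search(candidates, evaluate, budget=30, n_initial=8, seed=0):
    """Evaluate a few random settings, then repeatedly evaluate the
    unevaluated setting that a cheap surrogate predicts to be best."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)

    # Seed the surrogate with a handful of real evaluations.
    evaluated = {cfg: evaluate(cfg) for cfg in pool[:n_initial]}
    pool = pool[n_initial:]

    def predicted(cfg):
        # Hypothetical surrogate: reuse the score of the closest
        # already-evaluated setting (1-nearest-neighbour).
        nearest = min(
            evaluated,
            key=lambda seen: sum((a - b) ** 2 for a, b in zip(cfg, seen)),
        )
        return evaluated[nearest]

    while pool and len(evaluated) < budget:
        guess = max(pool, key=predicted)    # most promising by the surrogate
        pool.remove(guess)
        evaluated[guess] = evaluate(guess)  # one expensive evaluation

    best = max(evaluated, key=evaluated.get)
    return best, evaluated[best], len(evaluated)

def toy_score(cfg):
    # Stand-in for "train a model with cfg and return its validation score".
    learning_rate, max_depth = cfg
    return -((learning_rate - 0.1) ** 2) - ((max_depth - 6) / 10) ** 2

# Hypothetical search grid: 10 learning rates x 10 tree depths.
grid = [(0.01 * i, d) for i in range(1, 11) for d in range(1, 11)]
best_cfg, best_score, n_evals = sequential_search(grid, toy_score)
print(best_cfg, best_score, n_evals)
```

The point of the design is that only `budget` settings are ever scored for real; every other candidate is ranked by the surrogate, which is what keeps the number of expensive evaluations small.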
Findings
X-FLASH outperforms previous methods in F-measure, G-score, and recall.
Hyperparameter tuning significantly enhances detection performance.
The approach requires fewer evaluations, making it efficient.
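The three metrics named above can be computed from a binary confusion matrix. The sketch below uses the standard definitions of recall and F-measure; for G-score it assumes the harmonic mean of recall and specificity (one minus the false-alarm rate), which is a common definition but is not confirmed by this summary. The function name and the example counts are hypothetical.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return recall, F-measure, and G-score for anomaly detection,
    where "positive" means a transfer flagged as anomalous."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    specificity = 1.0 - false_alarm

    # F-measure: harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) else 0.0
    )
    # G-score (assumed definition): harmonic mean of recall and specificity.
    g_score = (
        2 * recall * specificity / (recall + specificity)
        if (recall + specificity) else 0.0
    )
    return {"recall": recall, "f_measure": f_measure, "g_score": g_score}

# Made-up counts: 8 anomalies caught, 2 missed, 2 false alarms, 88 true normals.
print(detection_metrics(tp=8, fp=2, tn=88, fn=2))
```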
Abstract
Modern scientific workflows are data-driven and are often executed on distributed, heterogeneous, high-performance computing infrastructures. Anomalies and failures in the workflow execution cause loss of scientific productivity and inefficient use of the infrastructure. Hence, detecting, diagnosing, and mitigating these anomalies are immensely important for reliable and performant scientific workflows. Since these workflows rely heavily on high-performance network transfers that require strict QoS constraints, accurately detecting anomalous network performance is crucial to ensure reliable and efficient workflow execution. To address this challenge, we have developed X-FLASH, a network anomaly detection tool for faulty TCP workflow transfers. X-FLASH incorporates novel hyperparameter tuning and data mining approaches for improving the performance of the machine learning algorithms to…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Software System Performance and Reliability · Scientific Computing and Data Management
