Bioinformatics Computational Cluster Batch Task Profiling with Machine Learning for Failure Prediction
Christopher Harrison, Christine R. Kirkpatrick, Inês Dutra

TL;DR
This paper presents a machine learning-based profiling system for bioinformatics cluster tasks to predict failures, aiming to improve resource scheduling by understanding IO-bound task behaviors.
Contribution
It introduces a machine learning approach for profiling IO-intensive tasks in bioinformatics clusters and predicting their failures, with the goal of informing more efficient scheduling.
Findings
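Analyzed a production cluster that contributed 6.7 thousand CPU hours to research over two years.
Developed a machine learning task profiling agent that attempts to predict failures among tasks with identical resource requests.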
Improved understanding of IO-bound task failure conditions.
Abstract
Motivation: Traditional computational cluster schedulers rely on user inputs and run-time resource requests for memory and CPU, not IO. The run times of heavily IO-bound tasks, like those seen in many big data and bioinformatics problems, depend on the IO subsystem's scheduling and are problematic for cluster resource scheduling. Rescheduling IO-intensive and errant tasks wastes resources. Understanding the conditions in both successful and failed tasks, and differentiating them, could provide knowledge for enhancing cluster scheduling and intelligent resource optimization. Results: We analyze a production computational cluster contributing 6.7 thousand CPU hours to research over two years. Through this analysis we develop a machine learning task profiling agent for clusters that attempts to predict failures between tasks with identical provisioning requests.
Taxonomy
Topics: Machine Learning in Bioinformatics · Genetics, Bioinformatics, and Biomedical Research · Advanced Proteomics Techniques and Applications
