Ponder: Online Prediction of Task Memory Requirements for Scientific Workflows
Fabian Lehmann, Jonathan Bader, Ninon De Mecquenem, Xing Wang, Vasilis Bountris, Florian Friederici, Ulf Leser, Lauritz Thamsen

TL;DR
Ponder is an online prediction strategy for scientific workflow tasks that significantly improves memory allocation efficiency, reduces failures, and shortens workflow execution time by adapting to different memory demand patterns.
Contribution
The paper introduces Ponder, a novel online task-sizing method that outperforms existing approaches by considering diverse memory demand patterns in real workflows.
Findings
Memory Allocation Quality improved by 71.0%
Workflow makespan reduced by 21.8%
Task failures decreased by 93.8%
Abstract
Scientific workflows are used to analyze large amounts of data. These workflows comprise numerous tasks, many of which are executed repeatedly, running the same custom program on different inputs. Users specify resource allocations for each task, which must be sufficient for all inputs to prevent task failures. As a result, task memory allocations tend to be overly conservative, wasting precious cluster resources, limiting overall parallelism, and increasing workflow makespan. In this paper, we first benchmark a state-of-the-art method on four real-life workflows from the nf-core workflow repository. This analysis reveals that certain assumptions underlying current prediction methods, which typically were evaluated only on simulated workflows, cannot generally be confirmed for real workflows and executions. We then present Ponder, a new online task-sizing strategy that considers and…
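To make the task-sizing problem in the abstract concrete, here is a minimal sketch of a generic online sizing loop. It is not Ponder's algorithm, which this summary does not describe in detail. The sketch assumes that peak memory grows roughly linearly with input size, that the user-specified limit is always safe, and that a failed task is retried with doubled memory. The class name OnlineSizer, its methods, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b for a single feature."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0:
        return 0.0, my  # all inputs identical: fall back to the mean
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


class OnlineSizer:
    """Hypothetical online sizer for one abstract task (not Ponder itself)."""

    def __init__(self, user_limit_mb, min_samples=3):
        self.user_limit_mb = user_limit_mb  # assumed safe for every input
        self.min_samples = min_samples
        self.history = []  # (input_size_mb, observed_peak_memory_mb)

    def observe(self, input_mb, peak_mb):
        """Record the peak memory of a finished task instance."""
        self.history.append((input_mb, peak_mb))

    def predict(self, input_mb):
        """Return a memory allocation in MB for the next task instance."""
        if len(self.history) < self.min_samples:
            return self.user_limit_mb  # too little data: trust the user
        xs, ys = zip(*self.history)
        a, b = fit_line(xs, ys)
        # Safety offset: the largest under-prediction seen so far.
        offset = max(0.0, max(y - (a * x + b) for x, y in self.history))
        estimate = a * input_mb + b + offset
        return min(max(estimate, 1.0), self.user_limit_mb)

    def retry(self, failed_allocation_mb):
        """After an out-of-memory failure, double the allocation (capped)."""
        return min(failed_allocation_mb * 2, self.user_limit_mb)


# Usage with made-up measurements for one task type.
sizer = OnlineSizer(user_limit_mb=8000)
for input_mb, peak_mb in [(100, 1000), (200, 2000), (300, 3000)]:
    sizer.observe(input_mb, peak_mb)
allocation = sizer.predict(400)
wasted = sizer.user_limit_mb - allocation  # memory freed versus the static limit
```

The gap between the static user limit and the predicted allocation is the memory that becomes available to other tasks. The retry step is where an under-prediction turns into a task failure and added runtime, which is the trade-off the abstract refers to.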
Taxonomy
Topics: Scientific Computing and Data Management
