Exploring Trade-offs in Dynamic Task Triggering for Loosely Coupled Scientific Workflows
Zhe Wang, Pradeep Subedi, Shaohua Duan, Yubo Qin, Philip Davis, Anthony Simonet, Ivan Rodero, Manish Parashar

TL;DR
This paper investigates the overheads of data-driven dynamic task triggering in scientific workflows, analyzing various design choices and providing practical guidance for constructing efficient, flexible workflows.
Contribution
It offers a comprehensive evaluation of overheads in dynamic task triggering and presents insights for optimizing data-driven scientific workflows.
Findings
Overheads vary with data size and distribution.
Certain design choices reduce triggering overhead.
Guidelines for constructing efficient workflows are provided.
Abstract
In order to achieve near-time insights, scientific workflows tend to be organized in a flexible and dynamic way. Data-driven triggering of tasks has been explored as a way to support workflows that evolve based on the data. However, the overhead introduced by such dynamic triggering of tasks is an under-studied topic. This paper discusses different facets of dynamic task triggers. Particularly, we explore different ways of constructing a data-driven dynamic workflow and then evaluate the overheads introduced by such design decisions. We evaluate workflows with varying data size, percentage of interesting data, temporal data distribution, and number of tasks triggered. Finally, we provide advice based upon analysis of the evaluation results for users looking to construct data-driven scientific workflows.
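As a rough illustration of what a data-driven trigger means here, the following minimal Python sketch checks each data block produced by a simulation step against a predicate and launches downstream tasks only when the block is "interesting". It is not the authors' implementation: the names run_workflow, is_interesting, and TriggerStats, the in-process task execution, and the sample data are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class TriggerStats:
    """Counters for reasoning about triggering cost (hypothetical helper)."""
    steps_checked: int = 0
    tasks_launched: int = 0


def run_workflow(
    steps: Sequence[Sequence[int]],
    is_interesting: Callable[[Sequence[int]], bool],
    tasks: Sequence[Callable[[Sequence[int]], Any]],
) -> Tuple[List[Tuple[int, Any]], TriggerStats]:
    """Evaluate the trigger predicate on every step; run tasks only on matches."""
    stats = TriggerStats()
    results: List[Tuple[int, Any]] = []
    for step_index, block in enumerate(steps):
        stats.steps_checked += 1          # the predicate runs on every block
        if is_interesting(block):         # data-driven decision
            for task in tasks:            # number of tasks triggered per match
                results.append((step_index, task(block)))
                stats.tasks_launched += 1
    return results, stats


# Hypothetical data: four time steps, two of which exceed the threshold.
steps = [[1, 2], [9, 14], [3, 1], [20, 5]]
results, stats = run_workflow(steps, lambda block: max(block) > 10, [sum, len])
```

In this toy setup, the block size, the fraction of blocks that satisfy the predicate, where those blocks fall in the step sequence, and the length of the task list loosely correspond to the parameters the evaluation varies (data size, percentage of interesting data, temporal data distribution, and number of tasks triggered).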
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
