Realizing Fast, Scalable and Reliable Scientific Computations in Grid Environments
Yong Zhao, Ioan Raicu, Ian Foster, Mihael Hategan, Veronika Nefedova, Mike Wilde

TL;DR
This paper presents an integrated system combining Swift, Karajan, and Falkon to enable fast, scalable, and reliable execution of large-scale scientific workflows in heterogeneous Grid environments, demonstrating significant performance improvements.
Contribution
The paper introduces an integrated system that combines Swift, Karajan, and Falkon to efficiently manage and execute large-scale scientific computations in Grid environments, addressing heterogeneity and dynamic workflows.
Findings
Achieves up to 90% reduction in execution time compared to traditional schedulers.
Demonstrates scalability and reliability across astronomy, neuroscience, and molecular dynamics applications.
Supports dynamic workflows with reduced code complexity using SwiftScript.
Abstract
Managing and executing large-scale scientific computations efficiently and reliably is quite challenging in practice. Scientific computations often involve thousands or even millions of tasks operating on large quantities of data; such data are often diversely structured and stored in heterogeneous physical formats; and scientists must specify and run such computations over extended periods on collections of compute, storage, and network resources that are heterogeneous, distributed, and may change constantly. We present the integration of several advanced systems (Swift, Karajan, and Falkon) to address the challenges in running various large-scale scientific applications in Grid environments. Swift is a parallel programming tool for rapid and reliable specification, execution, and management of large-scale science and engineering workflows. Swift consists of a simple…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management · Parallel Computing and Optimization Techniques
