Deep RC: A Scalable Data Engineering and Deep Learning Pipeline
Arup Kumar Sarker, Aymen Alsaadi, Alexander James Halpern, Prabhath Tangella, Mikhail Titov, Niranda Perera, Mills Staylor, Gregor von Laszewski, Shantenu Jha, and Geoffrey Fox

TL;DR
Deep RC is a scalable, heterogeneous runtime system that integrates data engineering, deep learning, and workflow management across HPC and cloud platforms, enabling efficient end-to-end scientific data pipelines.
Contribution
The paper introduces Deep RC, a heterogeneous runtime system that unifies data engineering and deep learning workflows on HPC and cloud infrastructures.
Findings
Reduces preprocessing and training times significantly
Supports multiple accelerators and communication libraries
Demonstrates strong performance on cloud and HPC systems
Abstract
Scientific domains such as genetics, climate modeling, and astronomy face significant obstacles in managing, preprocessing, and training on complex data for deep learning. Although several large-scale solutions offer distributed execution environments, open-source alternatives that integrate scalable runtime tools, deep learning, and data frameworks on high-performance computing platforms remain crucial for accessibility and flexibility. In this paper, we introduce Deep Radical-Cylon (RC), a heterogeneous runtime system that combines data engineering, deep learning frameworks, and workflow engines across several HPC environments, including cloud and supercomputing infrastructures. Deep RC supports heterogeneous systems with accelerators, allows the use of communication libraries such as MPI, GLOO, and NCCL across multi-node setups, and facilitates parallel and distributed…
