Hybrid Cloud and HPC Approach to High-Performance Dataframes
Kaiying Shan, Niranda Perera, Damitha Lenadora, Tianle Zhong, Arup Sarker, Supun Kamburugamuve, Thejaka Amila Kanewela, Chathura Widanage, and Geoffrey Fox

TL;DR
This paper presents a hybrid cloud and HPC approach to enhance distributed dataframes, integrating UCX to improve compatibility and performance in diverse environments beyond traditional MPI-based systems.
Contribution
We integrated UCX as a communication layer in Cylon to address MPI compatibility issues, enabling better deployment in varied HPC and cloud environments.
Findings
UCX integration improves compatibility with non-MPI environments
Enhanced performance in distributed data processing tasks
Method applicable to other MPI-dependent applications
Abstract
Data pre-processing is a fundamental component of any data-driven application. As data processing operations grow more complex and data volumes increase, Cylon, a distributed dataframe system, was developed to facilitate data processing both as a standalone application and as a library, especially for Python applications. Although Cylon shows promising performance results, we experienced difficulties integrating it with frameworks incompatible with the traditional Message Passing Interface (MPI). MPI implementations provide scalable and efficient communication routines, but their process launching mechanisms, which work well on mainstream HPC systems, are incompatible with some environments that adopt their own resource management systems. In this work, we alleviated this issue by directly integrating the Unified Communication X (UCX) framework, which supports a variety of…
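The idea of placing the communication layer behind a swappable backend can be illustrated with a minimal sketch. It is an illustration only, not Cylon's actual API: the names `in_memory_exchange` and `hash_shuffle` are hypothetical, all "ranks" are simulated inside one process, and no MPI or UCX calls are made. The point is that the dataframe logic (a hash-partitioned shuffle) depends only on an exchange function, so an MPI-based or UCX-based exchange could in principle be substituted without touching it.

```python
from typing import Callable, List, Tuple

Row = Tuple
# send[src][dst] holds the rows that rank `src` wants to deliver to rank `dst`.
SendBuffers = List[List[List[Row]]]
# An exchange backend returns recv with recv[dst][src] == send[src][dst].
Exchange = Callable[[SendBuffers], SendBuffers]


def in_memory_exchange(send: SendBuffers) -> SendBuffers:
    """Hypothetical stand-in backend: an all-to-all done by transposing buffers.

    A real deployment would place an MPI or UCX based exchange here instead.
    """
    world = len(send)
    return [[send[src][dst] for src in range(world)] for dst in range(world)]


def hash_shuffle(partitions: List[List[Row]], key_index: int,
                 exchange: Exchange) -> List[List[Row]]:
    """Repartition rows so that equal keys end up on the same simulated rank."""
    world = len(partitions)
    send: SendBuffers = [[[] for _ in range(world)] for _ in range(world)]
    for src, rows in enumerate(partitions):
        for row in rows:
            dst = hash(row[key_index]) % world  # destination rank for this key
            send[src][dst].append(row)
    received = exchange(send)
    # Concatenate the chunks each rank received, in source-rank order.
    return [[row for chunk in received[dst] for row in chunk]
            for dst in range(world)]


# Two simulated ranks, each holding part of a table keyed on column 0.
parts = [[(1, "a"), (2, "b")], [(3, "c"), (4, "d")]]
shuffled = hash_shuffle(parts, key_index=0, exchange=in_memory_exchange)
```

Because `hash_shuffle` only sees the `exchange` callable, the choice of transport stays a deployment decision rather than something baked into the operator.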
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
