RCOMPSs: A Scalable Runtime System for R Code Execution on Manycore Systems
Xiran Zhang, Javier Conejero, Sameh Abdulah, Jorge Ejarque, Ying Sun, Rosa M. Badia, David E. Keyes, Marc G. Genton

TL;DR
RCOMPSs is a scalable runtime system that enables efficient parallel execution of R applications on multicore and manycore systems, improving performance and scalability for large data analysis tasks.
Contribution
It introduces a dynamic, task-based runtime system for R that automates parallel execution and dependency management on high-performance computing systems.
Findings
Achieves strong and weak scalability up to 128 cores per node and 32 nodes.
Maintains over 70% parallel efficiency for KNN and K-means algorithms.
Performs acceptably for linear regression despite complex task dependencies.
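For readers unfamiliar with the terms, the scalability and efficiency figures above are usually read against the standard definitions below. This is a sketch under assumptions: the symbols $p_0$ (baseline core or node count), $p$ (scaled count), and $T(\cdot)$ (measured wall-clock time) are introduced here for illustration, and the paper's own baseline and exact definitions may differ.

```latex
% Strong scaling: fixed total problem size, resources grow from p_0 to p.
E_{\text{strong}}(p) = \frac{p_0 \, T(p_0)}{p \, T(p)}

% Weak scaling: problem size grows in proportion to p,
% so the work per core (or node) stays constant.
E_{\text{weak}}(p) = \frac{T(p_0)}{T(p)}
```

Under these definitions, an efficiency above 70% means the measured time is within a factor of about 1.43 of the ideal time at that scale.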
Abstract
R has become a cornerstone of scientific and statistical computing due to its extensive package ecosystem, expressive syntax, and strong support for reproducible analysis. However, as data sizes and computational demands grow, R's native support for parallelism remains limited. This paper presents RCOMPSs, a scalable runtime system that enables efficient parallel execution of R applications on multicore and manycore systems. RCOMPSs adopts a dynamic, task-based programming model, allowing users to write code in a sequential style, while the runtime automatically handles asynchronous task execution, dependency tracking, and scheduling across available resources. We present RCOMPSs using three representative data analysis algorithms, i.e., K-nearest neighbors (KNN) classification, K-means clustering, and linear regression, and evaluate their performance on two modern HPC systems: KAUST…
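The task-based model described in the abstract can be illustrated with a minimal sketch. It is written in Python with only the standard library and is not the RCOMPSs API: the names `task`, `wait_on`, `partial_sum`, and `merge` are hypothetical, and a thread pool stands in for the real scheduler. The point is only to show sequential-looking code whose calls return immediately, with dependencies inferred from the data passed between tasks.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Stand-in for the runtime's pool of workers (assumption: 4 local threads).
_pool = ThreadPoolExecutor(max_workers=4)


def task(fn):
    """Hypothetical decorator: calling the wrapped function submits it
    asynchronously and returns a Future instead of a value."""
    def submit(*args):
        def run():
            # Dependency tracking: any argument that is a Future is the
            # output of an earlier task, so wait for it before running.
            # Tasks only depend on earlier-submitted tasks and the pool
            # starts work in FIFO order, so this blocking cannot deadlock.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return fn(*resolved)
        return _pool.submit(run)
    return submit


def wait_on(future):
    """Hypothetical synchronization point: block until the value is ready."""
    return future.result()


@task
def partial_sum(chunk):
    return sum(chunk)


@task
def merge(a, b):
    return a + b


# User code reads sequentially; the four partial sums are independent and
# may run in parallel, while each merge waits on the results it consumes.
chunks = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
partials = [partial_sum(c) for c in chunks]
total = partials[0]
for p in partials[1:]:
    total = merge(total, p)

result = wait_on(total)
print(result)  # 55
```

A production runtime would typically build an explicit dependency graph and schedule tasks across nodes rather than block worker threads; the sketch trades that for brevity.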
Taxonomy
Topics: Scientific Computing and Data Management · Parallel Computing and Optimization Techniques · Data Analysis with R
