A Unified Approach to Concurrent, Parallel Map-Reduce in R using Futures
Henrik Bengtsson

TL;DR
The paper introduces the futurize package for R, which parallelizes map-reduce operations through a single function, futurize(). Existing sequential code needs little refactoring, and the same interface covers several map-reduce APIs, including base R, purrr, and foreach.
Contribution
The paper presents a unified approach to parallel map-reduce in R: the futurize package converts sequential map-reduce expressions written against various APIs into parallel equivalents that run in the future ecosystem.
Findings
Supports multiple map-reduce APIs in R
Enables minimal code changes for parallelization
Simplifies parallel computing with a unified interface
Abstract
The R ecosystem offers a rich variety of map-reduce application programming interfaces (APIs) for iterative computations, yet parallelizing code across these diverse frameworks requires learning multiple, often incompatible, parallel APIs. The futurize package addresses this challenge by providing a single function, futurize(), which transpiles sequential map-reduce expressions into their parallel equivalents in the future ecosystem; the future ecosystem then performs all the heavy lifting. By leveraging R's native pipe operator, users can parallelize existing code with minimal refactoring, often by simply appending '|> futurize()' to an expression. The package supports classical map-reduce functions from base R, purrr, crossmap, foreach, plyr, and BiocParallel, e.g., lapply(xs, fcn) |> futurize() and map(xs, fcn) |> futurize(), as well as a growing set of domain-specific packages, e.g., boot, caret, glmnet,…
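The usage pattern described in the abstract can be sketched as follows. This is a minimal illustration, not code from the paper: it assumes the futurize and future packages are installed and R 4.1 or later (for the native pipe). The inputs xs and fcn are hypothetical stand-ins for the placeholders in the abstract, and the choice of a two-worker multisession backend is an assumption made only for this example.

```r
# Assumes the 'future' and 'futurize' packages are installed.
library(future)
library(futurize)

# Hypothetical inputs for illustration; 'xs' and 'fcn' stand in for the
# placeholders used in the abstract.
xs <- 1:8
fcn <- function(x) x^2

# Choose how futures are resolved; here, two background R sessions
# (an assumption for this sketch, any future backend could be used).
plan(multisession, workers = 2)

# Sequential baseline.
ys_seq <- lapply(xs, fcn)

# The same expression with '|> futurize()' appended, as in the abstract.
ys_par <- lapply(xs, fcn) |> futurize()

# Both calls are expected to return the same list.
stopifnot(identical(ys_seq, ys_par))

# Return to sequential processing, which also shuts down the workers.
plan(sequential)
```

The sequential call is left unchanged apart from the appended pipe, which is the "minimal refactoring" the abstract refers to. The backend is selected separately through plan(), so the same piped expression should run on whichever backend the user has configured.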
Taxonomy
Topics: Scientific Computing and Data Management · Data Analysis with R · Distributed and Parallel Computing Systems
